the informal ramblings of a formal language researcher

Wednesday, July 29, 2009

(no) interpreters on iphone

I'm collecting blog posts and web pages discussing clause 3.3.2 of the iPhone SDK EULA and semi-related restrictions on iPhone apps.

mobilephonedevelopment article

rhomobile's interpretation of no interpreters rule

story of how a Basic interpreter got downgraded to a calculator (and how to bring the interpreter back if you're willing to jailbreak).

PhoneGap Discussion of arbitrariness of rejection

MacNN coverage of the PhoneGap issue.

Commodore 64 emulator rejection article

FrotZ got through the filter; I asked the developer about that in this discussion

Tuesday, July 21, 2009

git-svn argh!

CCIS decommissioned a number of its Solaris machines recently.

I was using one of them as the point of access for the Larceny SVN repository, via the svn+ssh protocol.

More recently I have been playing with git-svn, using that as my way to interact with the repository. I made a local git-svn clone of the Larceny SVN repository, and then I would clone the clone and make branches in the cloned clones as part of my experimentation with new ideas. Fun fun fun.

Except that CCIS decommissioned a number of its Solaris machines recently, and I was using one of them as the point of access, which meant that the repository URL looked something like svn+ssh://zulu.ccs.neu.edu/path-to-repos/, and git-svn (somewhat reasonably) assumes that these URLs will remain valid forever, and puts a copy of that URL into every log message of the clone.

Now, it's easy to update those log messages; that's what git-filter-branch is for. (See also this related blog entry.)
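As a rough illustration of the log-message rewrite: git-svn records the repository URL in a `git-svn-id:` trailer line at the end of each imported commit message. The Python sketch below shows a message filter that swaps the old URL prefix for a new one. The new host name and the sample message are invented for illustration, and the old URL simply mirrors the shape quoted above. A script along these lines could in principle be driven by git-filter-branch's `--msg-filter` option, but that wiring is not shown here.

```python
import re

# Both URLs are placeholders: the old one mirrors the shape quoted in the
# post, the new one is purely hypothetical.
OLD_URL = "svn+ssh://zulu.ccs.neu.edu/path-to-repos"
NEW_URL = "svn+ssh://newhost.example.edu/path-to-repos"

# git-svn appends a trailer of the form
#   git-svn-id: <url>@<revision> <repository-uuid>
TRAILER = re.compile(r"^(git-svn-id: )" + re.escape(OLD_URL), re.MULTILINE)


def relocate(message: str) -> str:
    """Return the commit message with the old URL prefix replaced
    in any git-svn-id trailer lines; other text is left alone."""
    return TRAILER.sub(lambda m: m.group(1) + NEW_URL, message)


if __name__ == "__main__":
    sample = (
        "Fix typo in README\n"
        "\n"
        "git-svn-id: svn+ssh://zulu.ccs.neu.edu/path-to-repos/trunk@1234 "
        "00000000-0000-0000-0000-000000000000\n"
    )
    print(relocate(sample), end="")
```

Note that this only repairs the messages. It does nothing about local commits whose parents are the old, un-rewritten changesets.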

But what's not so easy is dealing with the sudden split in the tree structure of the development on my cloned clones; all of my local deltas seem to point to the old changesets with the old bad URL. And "git rebase" does not seem to be magically able to handle this case.

Thus: git-svn argh!

If the SVN folks thought that a "svn switch --relocate" command was important, maybe the git-svn developer(s) should also consider their design decisions...

Monday, January 12, 2009

Zecreasing Zerived Zvariables

This is a great little story. And probably something I could explain to my Mom.

http://www.freedom-to-tinker.com/blog/felten/debugging-zune-blackout

(plus I've had this obsession with date-oriented code lately...)

(p.s. can one even pronounce "Zvariables"?)

Friday, January 09, 2009

5 or 6 forced restarts later...

I wasted some time this evening trying to use gprof to profile Larceny.

Eventually I discovered that the -pg option simply does not work with gcc on Intel Macs. (Why on earth doesn't the man page for gprof say something about this?)

Apple instead recommends that one use Shark or Saturn, both provided when you install the CHUD Tools (Computer Hardware Understanding Developer Tools).

So I did that. I downloaded a copy of the CHUD Tools (and as far as I can remember, I did it by searching on ADC for them, and got a copy of chud.dmg). I was a little surprised by the file times (they said something like 2007) but I pressed on, eager to start seeing some profiler output, especially since everything my searches turned up about Shark sounded very positive.

And then when I attempted to compile with -finstrument-functions, so I could use Saturn in the same way that I might have used gprof, I got a link error:

Undefined symbols:
"___cyg_profile_func_exit", referenced from:
_consolemsg in larceny.o
_consolemsg in larceny.o
...
"___cyg_profile_func_enter", referenced from:
_panic_abort in larceny.o
_panic_exit in larceny.o
...


I abandoned further attempts to use Saturn pretty quickly.

So then I tried Shark. And that was, if anything, worse. Because every time I tried to stop a profiling session, it would restart my machine!

Eventually I discovered (though not easily) this discussion post.

Hmm. CHUD 4.4.4 panics. I look at my version. Yep, it's 4.4.4.

So I did a more careful search for a more recent version of CHUD (4.6.1, specifically). We'll see how that works out for me in a bit.

(I'm mostly posting this so that other people who run into a similar problem know that they need to get a newer version of CHUD. Maybe later I'll try to retrace my steps that led me to thinking that CHUD 4.4.4 was the version I needed to use...)

  • I do not know who linked to it, but some page (that I believe was on the Apple site) sent me to the following ftp link, which has the version of CHUD that is incompatible with newer versions of OS X: ftp://ftp.apple.com/developer/Tool_Chest/Testing_-_Debugging/Performance_tools/CHUD_4.4.4.dmg

Monday, December 15, 2008

CFER : "suitable for blogs"

ICFP 2009 has a "Call For Experience Reports", and specifically mentions that it's suitable for posting on blogs.

So here goes. :)

ICFP 2009: CFER

Wednesday, July 02, 2008

data structures, network outages, and php

The most "productive" portion of my day was spent prototyping the interface to the summary set matrix that I will be putting into the regional collector RSN. I have the header file worked out and started working on the representation itself. I got part way through writing a constructor function before deciding that I wanted to think a bit more about how I want to handle allocation and deallocation of the entries in the sparse matrix.

(Some of that time was spent reviewing the existing source for the Larceny runtime, especially the different attribute bits one can put on pages of memory in Larceny, and the history of when they were introduced and/or shifted around by Lars...)

I also had a back-and-forth with the Systems staff since artichoke and poblano became inaccessible from the outside world at some point yesterday evening.

I finished off the day trying to figure out why my GC benchmarking script is still failing on the dynregion/ branch. I have more of a clue about it now, but am pretty unhappy about how much of a kludge this supposedly simpler script is becoming. (Maybe I should give up doing these things in shell script and just do them as Larceny scripts.)

Monday, June 30, 2008

hi after hiatus

This weekend I mentioned that I had a blog to my family, and then showed the url to Stephanie.

I proceeded to become engrossed in reading over my old posts; I had forgotten some of the interesting software experiments I had been posting online.

Of course, I have not made a post in almost two years. I have since switched to keeping an internal work diary, so that I would not worry about giving away the farm (or embarrassing outcomes of my day to day research), and to posting any interesting links that I wanted to see later to my del.icio.us account.

But I realized after looking at the old entries that I liked broadcasting my progress.

So I am going to try to maintain this blog again, and see if I can figure out how to distill my daily work diary into something small (and publicly distributable) each day (or perhaps every other day or so... who knows).

So, today:

  • Rasputin's disk was full; I had to wrestle with that for a little while. It seems like each disk fills up on such a regular basis that I really should consider fixing the autobuild scripts to clean up after themselves or to compress their generated directories or both...

  • I finally checked in a fix to Larceny's simplistic stopcopy collector on the IAssassin backend so that when allocation fills up a chunk in a semispace, the runtime first attempts to switch the allocation pointer to a free chunk within the semispace rather than attempt a whole heap collection. This is a crucial fix, because the stopcopy collector had asymptotically terrible performance before, which affected both our benchmark results and some of our users who wanted to load a lot of code before performing a heap dump.

  • I spent some time trying to double-check the runtime source code's make dependencies, because I thought I saw evidence that not everything was being rebuilt properly in response to changes I made for fixing the stopcopy collector. Unfortunately I was unable to pin down any particular missing dependency.

  • I tried generalizing the nightly GC benchmarking script so that it could also benchmark my dynregion development branch. But then my attempts to just double-check the behavior of the GC benchmarking script itself failed, and my investigations there led me to file Ticket #547 in Larceny's trac database.
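The stopcopy fix in the second bullet above can be pictured with a toy model. This is only a sketch in Python under invented assumptions: the `Chunk` and `Semispace` classes, the chunk sizes, and the "nothing survives" collection are all hypothetical stand-ins, not Larceny's runtime structures. What it shows is the order of attempts: current chunk first, then any later chunk with room, and only then a whole-heap collection.

```python
from typing import List


class Chunk:
    """A fixed-size region of a semispace with a bump pointer."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.used = 0

    def try_alloc(self, nbytes: int) -> bool:
        """Bump-allocate nbytes if they fit; report success."""
        if self.used + nbytes > self.size:
            return False
        self.used += nbytes
        return True


class Semispace:
    def __init__(self, chunk_sizes: List[int]) -> None:
        self.chunks = [Chunk(s) for s in chunk_sizes]
        self.current = 0        # index of the chunk being allocated into
        self.collections = 0    # how many whole-heap collections ran

    def alloc(self, nbytes: int) -> None:
        # Fast path: bump-allocate in the current chunk.
        if self.chunks[self.current].try_alloc(nbytes):
            return
        # The fix: before collecting, look for a later chunk with room.
        for i in range(self.current + 1, len(self.chunks)):
            if self.chunks[i].try_alloc(nbytes):
                self.current = i
                return
        # Only now fall back to a whole-heap collection.
        self.collect()
        if not self.chunks[self.current].try_alloc(nbytes):
            raise MemoryError("object does not fit in an empty chunk")

    def collect(self) -> None:
        # Stand-in for a real collection: pretend nothing survives.
        self.collections += 1
        for chunk in self.chunks:
            chunk.used = 0
        self.current = 0
```

Without the middle step, every filled chunk would trigger a collection even though free chunks remain, which is the kind of behavior that gets expensive when loading a lot of code.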



I also spent some time browsing the web/google groups/etc. I added some links to my aforementioned del.icio.us account. I don't think I'll be explicitly mentioning such activity on my part in the future.

Sunday, September 10, 2006

Oh hdiutil, what can't you do?

(orig article)

The punchline:

hdiutil convert image.dmg -format UDTO -o image.iso

Although this makes a file with the extension .cdr, the article claims that it is an iso file and can be renamed as such.

This allows me to use the OS X Disk Utility to create .iso images that can then be mounted from Parallels desktop or VMware.

UPDATE:
The above article seems to be a complete lie, but at least it pointed me at hdiutil... the above approach makes iso images that Mac OS X can mount, but that do not work within Parallels.

However, the following command seems like it does the job:

hdiutil makehybrid -o image image.dmg

Tuesday, August 08, 2006

stupid LEA tricks

LEA is the "load effective address" instruction on x86 processors.

Despite the "L" in the name, this is solely an arithmetic instruction. Which means there are some nasty little games you can play with it. Like this one I got courtesy of Frank Kotler from:

http://groups.google.com/group/alt.lang.asm/msg/27d6ed6183448057?hl=en&


global _start

section .data
number_string db '123', 10

section .text
_start:
nop

push number_string
call atoi
add esp, byte 4

mov ebx, eax
mov eax, 1
int 80h

atoi:
mov edx, [esp + 4] ; pointer to string
xor eax, eax ; clear "result"
.top:
movzx ecx, byte [edx]
inc edx
cmp ecx, byte '0'
jb .done
cmp ecx, byte '9'
ja .done

; we have a valid character - multiply
; result-so-far by 10, subtract '0'
; from the character to convert it to
; a number, and add it to result.

lea eax, [eax + eax * 4]
lea eax, [eax * 2 + ecx - 48]

jmp short .top
.done:
ret
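To make the arithmetic explicit: the first lea computes eax + eax*4 (that is, eax*5), and the second computes eax*2 + ecx - 48, so together they yield result*10 + digit. The Python sketch below mirrors the loop step by step. The function name `atoi_lea` is made up, and the 32-bit masking is an assumption added to imitate register wraparound.

```python
def atoi_lea(text: str) -> int:
    """Mirror the assembly loop: stop at the first non-digit byte."""
    eax = 0
    for ecx in text.encode("ascii"):
        if ecx < ord("0") or ecx > ord("9"):
            break
        eax = (eax + eax * 4) & 0xFFFFFFFF        # lea eax, [eax + eax * 4]
        eax = (eax * 2 + ecx - 48) & 0xFFFFFFFF   # lea eax, [eax * 2 + ecx - 48]
    return eax


if __name__ == "__main__":
    # Same input as number_string in the listing: '123' followed by a newline.
    print(atoi_lea("123\n"))
```

For '123' the intermediate values are 0 -> 1 -> 12 -> 123, and the newline (byte 10) ends the loop because it is below '0'.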

Tuesday, July 18, 2006

free smalltalk! (books, that is)

Just discovered this page today:

On a related note, here are some other free books online that I wasn't aware of until recently:

Wednesday, May 03, 2006

321 contact!

Larceny just ran x86 code in the heap for the first time!

The initial scheme code:
(define seg (assemble (compile 321)))
(define env (environment-copy (interaction-environment)))
(define cseg (sassy-postpass-segment seg))
(define linked (link-lop-segment cseg env))

Note that the "program" we're compiling here is just evaluating the literal 321. Not very exciting to interpret, but to run off the heap, marvelous!

The (insane) hack on the x86 implementation of MacScheme's invoke:
 dec dword [GLOBALS+G_TIMER]
jnz short %%L1
%%L0: mcall M_INVOKE_EX
jmp short %%L1
%%L2: storer 0, RESULT
const2regf RESULT, fixnum(%1)
int3 ;; debugging interrupt
;;; (the lack of space is significant; gdb ignores "int 3")
add TEMP,-BVEC_TAG+BVEC_HEADER_BYTES
jmp TEMP
%%L1: lea TEMP, [RESULT+(8-PROC_TAG)]
test TEMP_LOW, tag_mask
jnz short %%L0
;; Felix inserted checks if we actually have a
;; fixnum here...
mov TEMP, [TEMP-8+PROC_CODEVECTOR_NATIVE]
test TEMP_LOW, fixtag_mask
jnz short %%L2
storer 0, RESULT
const2regf RESULT, fixnum(%1)
jmp TEMP

The heart of the hack is that invoke is now dispatching based on the kind of code pointer it got. This should allow me to keep using Petit for the heart of the system, but still test the code in the heap directly.
  • For fixnums (which are actually addresses into the text segment, but don't tell the garbage collector), we do the old invocation routine.

  • For bytevector pointers (which should point at codevectors in the heap), strip the tag off the pointer, then add the offset to skip over the bytevector header (on the heap object itself). Finally, JUMP!
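The two cases above amount to a small decision on the low bits of the code pointer. Here is a sketch of that decision in Python. The three constants are assumptions chosen only for illustration (the real values live in Larceny's runtime headers and may differ), and `jump_target` is a hypothetical name.

```python
from typing import Tuple

# Hypothetical tag layout, for illustration only.
FIXTAG_MASK = 0b11        # assumed: the low two bits are zero for fixnums
BVEC_TAG = 0b101          # assumed tag carried by bytevector pointers
BVEC_HEADER_BYTES = 4     # assumed size of the bytevector header


def jump_target(code_pointer: int) -> Tuple[str, int]:
    """Classify a code pointer the way the hacked invoke does and
    return (where the code lives, address to jump to)."""
    if (code_pointer & FIXTAG_MASK) == 0:
        # Fixnum: really an address in the text segment; jump to it as-is.
        return ("text-segment", code_pointer)
    # Otherwise treat it as a tagged bytevector pointer: strip the tag,
    # then skip the header to reach the first instruction byte.
    return ("heap", code_pointer - BVEC_TAG + BVEC_HEADER_BYTES)
```

The second branch corresponds to the `add TEMP,-BVEC_TAG+BVEC_HEADER_BYTES` followed by `jmp TEMP` in the listing.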

And, at long last, the actual result:
(gdb) r -heap sassy.heap
Starting program: /home/pnkfelix/larcenydev/trunk/larceny_src/twobit.bin -heap sassy.heap
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...LARCENY_ROOT not set; using current directory
Larceny v0.91 "Children's Ice Cream" (May 2 2006 10:41:54, precise:Linux:unified)


> (load "testsass.scm")

> (linked)

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08342d49 in ..@6866.L2 ()
(gdb) stepi
0x08342d4c in ..@6866.L2 ()
(gdb) stepi
0x00767454 in ?? ()
(gdb) c
Continuing.
321

Did you see it??? Did you??? That 321 at the end, that was produced by code in the heap. (Well, the value was; the actual action of printing the value is still happening from code in the text segment).

Woo hoo!!!

Wednesday, February 01, 2006

on parallelizing a garbage collector

So here are the lessons I learned from attempting to parallelize a stop-and-copy garbage collector, as implemented in the Larceny runtime, using Cilk's primitives as the basis for the parallelization.

At a high level, based on Flood's paper on parallel collection on SMP machines, I decided to start with a naive recursive stop-and-copy collection algorithm, and parallelize that using Cilk's spawn and sync primitives. (The Flood paper points out the utility of a work-stealing approach to load-balancing, which the Cilk runtime gives you "for free.")

Of course, the devil is in the details. First of all, all I had to start with was a kernel collector that uses Cheney's algorithm, which is not trivial to parallelize (as far as I can tell). So I had to develop a recursive algorithm that behaved in a manner compatible with the behavior of the Cheney collector (this is not as trivial as it sounds).
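For readers who have not seen the recursive style, here is a toy copying collector in Python. It is a sketch under invented assumptions, not Larceny's collector: the `Obj` class, its `forward` slot, and the list standing in for tospace are all hypothetical. The shape is the point: copy an object, leave a forwarding pointer behind, then recurse into its fields.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Obj:
    """A toy heap object: a payload plus references to other objects."""
    payload: int
    fields: List["Obj"] = field(default_factory=list)
    forward: Optional["Obj"] = None   # forwarding pointer, set once copied


def copy_recursive(obj: Obj, tospace: List[Obj]) -> Obj:
    """Copy obj and everything reachable from it into tospace, depth first."""
    if obj.forward is not None:
        return obj.forward            # already copied: reuse the new copy
    new = Obj(obj.payload)
    tospace.append(new)
    obj.forward = new                 # install before recursing, so cycles end
    new.fields = [copy_recursive(child, tospace) for child in obj.fields]
    return new


def collect(roots: List[Obj]) -> List[Obj]:
    """Return the new roots; unreachable objects are simply never copied."""
    tospace: List[Obj] = []
    return [copy_recursive(root, tospace) for root in roots]
```

Each recursive call on a child is a natural candidate for a spawn, with a sync before the new object's fields are considered complete. Cheney's algorithm instead scans tospace with a single pointer, which is why it does not decompose as readily.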

Even after I developed the recursive algorithm, I discovered a new setback: the Cilk language, as currently documented, has a restriction that you can only call Cilk functions from other Cilk functions. That is, Cilk functions can call out to C functions, but not vice versa.

This immediately leads to a huge problem for Larceny, because the Scheme code is meant to be able to call into the garbage collector. So that means that the Scheme code needs to be output as Cilk code, not C code. And in fact, the entire runtime and compiler system needs to be shifted to use Cilk instead of C. ACK!

Luckily, the very newest version of Cilk has a beta interface for calling Cilk functions from C code. Unluckily, the interface is indeed beta, and there were some bugs that were showstoppers for me (in hindsight, there were workarounds for the bugs, but that's neither here nor there).

So, here are the important lessons from the project:

  1. Reverse engineering is hard. Even when you're going from a complex algorithm to a simple one, there are always details in "real systems" that get overlooked. (For me, it was particular invariants in terms of the Cheney algorithm deliberately not traversing certain structures at certain times; it was overloading the generation bits to control this, even though this was still relevant for non-generational collection.)

  2. If you want your language extension/variant A to interoperate with language B, it needs to go in both directions. You may think it will suffice to have one layered on top of the other, but in the end, you will be wrong, and your users will hate you. You're better off designing that interoperability in from the beginning. (This is relevant for languages that seek to interoperate with the CLR or the JVM, in that your extension language should define classes that themselves can be invoked or extended by terms in C# or Java, respectively.)

Tuesday, January 03, 2006

They ID'ed me 100%

Bourbon
Congratulations! You're 118 proof, with specific scores in beer (20), wine (100), and liquor (78).
Screw all that namby-pamby chick stuff, you're going straight for the bottle and a shot glass! It'll take more than a few shots of Wild Turkey or 99 Bananas before you start seeing pink elephants. You know how to handle your alcohol, and yourself at parties.



My test tracked 4 variables. How you compared to other people your age and gender:
You scored higher than 30% on proof
You scored higher than 50% on beer index
You scored higher than 84% on wine index
You scored higher than 66% on liquor index

Link: The Alcohol Knowledge Test written by hoppersplit on Ok Cupid, home of the 32-Type Dating Test

Sunday, November 13, 2005

Swapping Ctrl/CapsLock (PCs or Macs!)

Someone else cares about this, and he cares enough to catalogue how to do it on all sorts of machines!
Just go here.

(hmm. Actually, that was a total lie; the link above only treats Windows and Linux machines. Well, this guy says how to do it on 10.4 machines.)

Thursday, November 03, 2005

some windows stuff worth knowing.


  • This page tells you how to get free copies of the command line development tools that are included in Visual Studio. That's right, get yourself the C/C++ compiler and header files (as well as some batch scripts to set up your environment properly), all direct from Microsoft.


    • .NET Framework SDK (for the compiler and linker).

    • Platform SDK (for the header files).

    • As a side note to this, I had to learn about the call statement for DOS Batch scripts in order to learn how to make a batch file that would call each of the batch files in sequence, because each of the above SDKs has a different batch file to set up the environment.


  • This page tells you how to edit the Windows Registry to change the behavior of the Caps Lock key. For an Emacs user like me, this was a crucial thing to learn, since the control key is really hard to get at on modern Windows laptops. Speaking of which . . .

  • This page tells you how to install Emacs on a Windows machine. It works well enough.

  • Cygwin. Nuff said.

Wednesday, August 31, 2005

GC'ing classes

.NET, as far as I can tell, does not garbage collect unreachable code. At best, you can try to manually manage the memory associated with dynamically loaded code by loading into separate AppDomains that you unload by hand. I have not really experimented with this option.

I was discussing this problem with a friend, who asserted that Java has the same problem.

To prove him wrong, I wrote the following class. You run it on the command line, passing a numeric argument that indicates the number of distinct classes you want to load. Try it out with and without the -Xnoclassgc option in Java!

import java.lang.Integer;
import java.lang.ClassLoader;
import java.lang.Class;
import java.lang.ClassNotFoundException;

public class ClassSFS {

public static void println(String s) {
System.out.println(s);
}
public static void println() {
System.out.println();
}
public static void print(String s) {
System.out.print(s);
}
public static void main(String[] args) {
println("Hello World");
int num_classes = Integer.parseInt(args[0]);
initbytes();
for(int i = 0; i < num_classes; i++) {

Tbytes[ 48 + 5 ] = (byte) (0x61 + (i/100) % 26);
Tbytes[ 48 + 6 ] = (byte) (0x61 + (i/10) % 26);
Tbytes[ 48 + 7 ] = (byte) (0x61 + (i/1) % 26);

CLoader cl = new CLoader();
try {
Class c = cl.findClass("T");
Object x = c.newInstance();
System.out.println("ClassLoader "+i+", x:"+x);
} catch (ClassNotFoundException e) {
System.out.println("ClassLoader "+i+", ClassNotFound e:"+e);
} catch (InstantiationException e) {
System.out.println("ClassLoader "+i+", InstantiationException e:"+e);
} catch (IllegalAccessException e) {
System.out.println("ClassLoader "+i+", IllegalAccessException e:"+e);
}
}
}

static class CLoader extends ClassLoader {
CLoader() { super(); }
public Class findClass(String name) throws ClassNotFoundException {
return
super.defineClass(name,
Tbytes,
0,
Tbytes.length);
}
}


private static int[] iTbytes = {

0xca, 0xfe, 0xba, 0xbe, 0x00, 0x00, 0x00, 0x2e, 0x00, 0x20, 0x0a, 0x00, 0x0a, 0x00, 0x16, 0x07,
0x00, 0x17, 0x0a, 0x00, 0x02, 0x00, 0x16, 0x08, 0x00, 0x18, 0x0a, 0x00, 0x02, 0x00, 0x19, 0x09,
0x00, 0x09, 0x00, 0x1a, 0x0a, 0x00, 0x02, 0x00, 0x1b, 0x08, 0x00, 0x0b, 0x07, 0x00, 0x1c, 0x07,
/* f e e */
0x00, 0x1d, 0x01, 0x00, 0x03, 0x66, 0x65, 0x65, 0x01, 0x00, 0x12, 0x4c, 0x6a, 0x61, 0x76, 0x61,
0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67, 0x3b, 0x01, 0x00, 0x06,
0x3c, 0x69, 0x6e, 0x69, 0x74, 0x3e, 0x01, 0x00, 0x03, 0x28, 0x29, 0x56, 0x01, 0x00, 0x04, 0x43,
0x6f, 0x64, 0x65, 0x01, 0x00, 0x0f, 0x4c, 0x69, 0x6e, 0x65, 0x4e, 0x75, 0x6d, 0x62, 0x65, 0x72,
0x54, 0x61, 0x62, 0x6c, 0x65, 0x01, 0x00, 0x08, 0x74, 0x6f, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67,
0x01, 0x00, 0x14, 0x28, 0x29, 0x4c, 0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f,
0x53, 0x74, 0x72, 0x69, 0x6e, 0x67, 0x3b, 0x01, 0x00, 0x08, 0x3c, 0x63, 0x6c, 0x69, 0x6e, 0x69,
0x74, 0x3e, 0x01, 0x00, 0x0a, 0x53, 0x6f, 0x75, 0x72, 0x63, 0x65, 0x46, 0x69, 0x6c, 0x65, 0x01,
0x00, 0x06, 0x54, 0x2e, 0x6a, 0x61, 0x76, 0x61, 0x0c, 0x00, 0x0d, 0x00, 0x0e, 0x01, 0x00, 0x16,
0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67,
0x42, 0x75, 0x66, 0x66, 0x65, 0x72, 0x01, 0x00, 0x03, 0x54, 0x3c, 0x3e, 0x0c, 0x00, 0x1e, 0x00,
0x1f, 0x0c, 0x00, 0x0b, 0x00, 0x0c, 0x0c, 0x00, 0x11, 0x00, 0x12, 0x01, 0x00, 0x01, 0x54, 0x01,
0x00, 0x10, 0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x4f, 0x62, 0x6a, 0x65,
0x63, 0x74, 0x01, 0x00, 0x06, 0x61, 0x70, 0x70, 0x65, 0x6e, 0x64, 0x01, 0x00, 0x2c, 0x28, 0x4c,
0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67,
0x3b, 0x29, 0x4c, 0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x53, 0x74, 0x72,
0x69, 0x6e, 0x67, 0x42, 0x75, 0x66, 0x66, 0x65, 0x72, 0x3b, 0x00, 0x21, 0x00, 0x09, 0x00, 0x0a,
0x00, 0x00, 0x00, 0x01, 0x00, 0x0a, 0x00, 0x0b, 0x00, 0x0c, 0x00, 0x00, 0x00, 0x03, 0x00, 0x01,
0x00, 0x0d, 0x00, 0x0e, 0x00, 0x01, 0x00, 0x0f, 0x00, 0x00, 0x00, 0x1d, 0x00, 0x01, 0x00, 0x01,
0x00, 0x00, 0x00, 0x05, 0x2a, 0xb7, 0x00, 0x01, 0xb1, 0x00, 0x00, 0x00, 0x01, 0x00, 0x10, 0x00,
0x00, 0x00, 0x06, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00, 0x11, 0x00, 0x12, 0x00,
0x01, 0x00, 0x0f, 0x00, 0x00, 0x00, 0x2e, 0x00, 0x02, 0x00, 0x01, 0x00, 0x00, 0x00, 0x16, 0xbb,
0x00, 0x02, 0x59, 0xb7, 0x00, 0x03, 0x12, 0x04, 0xb6, 0x00, 0x05, 0xb2, 0x00, 0x06, 0xb6, 0x00,
0x05, 0xb6, 0x00, 0x07, 0xb0, 0x00, 0x00, 0x00, 0x01, 0x00, 0x10, 0x00, 0x00, 0x00, 0x06, 0x00,
0x01, 0x00, 0x00, 0x00, 0x04, 0x00, 0x08, 0x00, 0x13, 0x00, 0x0e, 0x00, 0x01, 0x00, 0x0f, 0x00,
0x00, 0x00, 0x1e, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06, 0x12, 0x08, 0xb3, 0x00, 0x06,
0xb1, 0x00, 0x00, 0x00, 0x01, 0x00, 0x10, 0x00, 0x00, 0x00, 0x06, 0x00, 0x01, 0x00, 0x00, 0x00,
0x02, 0x00, 0x01, 0x00, 0x14, 0x00, 0x00, 0x00, 0x02, 0x00, 0x15

};

private static byte[] Tbytes;

public static void initbytes()
{
Tbytes = new byte[ iTbytes.length ];
for(int i = 0; i < iTbytes.length; i++) {
Tbytes[i] = (byte)(iTbytes[i]);
}

print(""+nybble2char((Tbytes[0]>>4) & 0xF));
print(""+nybble2char((Tbytes[0]>>0) & 0xF));
print(""+nybble2char((Tbytes[1]>>4) & 0xF));
print(""+nybble2char((Tbytes[1]>>0) & 0xF));
print(""+nybble2char((Tbytes[2]>>4) & 0xF));
print(""+nybble2char((Tbytes[2]>>0) & 0xF));
print(""+nybble2char((Tbytes[3]>>4) & 0xF));
print(""+nybble2char((Tbytes[3]>>0) & 0xF));
println();
}

private static char nybble2char(int b) {
switch (b) {
case 0xf: return 'f';
case 0xe: return 'e';
case 0xd: return 'd';
case 0xc: return 'c';
case 0xb: return 'b';
case 0xa: return 'a';
default: return (char) (b+'0'); // nybbles 0..9 map to the digits '0'..'9'
}
}

public static int Tcounter = 0;
}


The iTbytes array was generated by compiling public class T { private static String fee = "fee"; public String toString() { return "T<>"+fee; }}, then loading the resulting class file into emacs, switching to hexl-mode, and doing some keyboard-macrology to convert it to something javac would accept.

Thursday, August 25, 2005

C#, pass-by-value, pass-by-reference

Here's a nice little snippet of C# for you.
using System;

namespace InterfaceSample {
public delegate void Changed();
interface IPoint {
int X { get; set; }
int Y { get; set; }
}

struct Point: IPoint {
private int xValue, yValue;
public int X { get { return xValue; } set { xValue = value; } }
public int Y { get { return yValue; } set { yValue = value; } }
}

public class EntryPoint {
public static int Main() {
String formatstr = " p1.X: {0}, p1.Y: {1}, p2.X: {2}, p2.Y: {3} ip1.X: {4}, ip1.Y: {5}, ip2.X: {6}, ip2.Y: {7}";

Point p1 = new Point();
p1.X = p1.Y = 42;
IPoint ip1 = p1;
Point p2 = (Point) ip1;
IPoint ip2 = ip1;

Console.WriteLine(formatstr, p1.X, p1.Y, p2.X, p2.Y, ip1.X, ip1.Y, ip2.X, ip2.Y);
p1.X = p1.Y = 21; Console.WriteLine("p1.X = p1.Y = 21;");
Console.WriteLine(formatstr, p1.X, p1.Y, p2.X, p2.Y, ip1.X, ip1.Y, ip2.X, ip2.Y);
ip1.X = ip1.Y = 84; Console.WriteLine("ip1.X = ip1.Y = 84;");
Console.WriteLine(formatstr, p1.X, p1.Y, p2.X, p2.Y, ip1.X, ip1.Y, ip2.X, ip2.Y);
return 0;
}
}
}
In C#, classes and interfaces can have properties, which have usage syntax similar to fields but semantics similar to methods. In the interface, you just declare the property name and whether it has getters/setters; in the class, you then define the behavior you want for the property.

In C#, you can declare struct types, which are value types in the language. This means that they are passed by value. However, they are not immutable.

In C#, a struct type can implement an interface. Ah, what fun.

Here's the output of the above program:
  p1.X: 42, p1.Y: 42, p2.X: 42, p2.Y: 42 ip1.X: 42, ip1.Y: 42, ip2.X: 42, ip2.Y: 42
p1.X = p1.Y = 21;
p1.X: 21, p1.Y: 21, p2.X: 42, p2.Y: 42 ip1.X: 42, ip1.Y: 42, ip2.X: 42, ip2.Y: 42
ip1.X = ip1.Y = 84;
p1.X: 21, p1.Y: 21, p2.X: 42, p2.Y: 42 ip1.X: 84, ip1.Y: 84, ip2.X: 84, ip2.Y: 84


The mutation to p1's properties is not propagated over to ip1, which looks odd to a Java programmer like me. This is because the assignment ip1 = p1 makes a copy of p1 when it "boxes" it into ip1.

And even odder, given the previous paragraph, the mutations to ip1 are carried over to ip2. Actually, this isn't so odd, since ip1 is an interface after all, that might ("must", considering boxing?) be implemented by an object, and therefore the assignment must copy a reference to the object.

Finally, you can copy from the interface back to a Point, but this requires a cast. This makes sense (see previous paragraph).
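The copy/reference behavior described in the last three paragraphs can be mimicked in a language without value types by making the copies explicit. The Python sketch below is only a model of the C# semantics: `copy.copy` stands in for boxing and unboxing, plain assignment stands in for copying an interface reference, and `demo` is a made-up name.

```python
import copy
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


def demo():
    p1 = Point(42, 42)
    ip1 = copy.copy(p1)    # IPoint ip1 = p1;        boxing copies the struct
    p2 = copy.copy(ip1)    # Point p2 = (Point) ip1; unboxing copies again
    ip2 = ip1              # IPoint ip2 = ip1;       copies only the reference

    p1.x = p1.y = 21       # affects p1 alone
    ip1.x = ip1.y = 84     # visible through ip2 as well, same boxed object
    return p1, p2, ip1, ip2


if __name__ == "__main__":
    for name, point in zip(("p1", "p2", "ip1", "ip2"), demo()):
        print(name, point.x, point.y)
```

The final values match the last line of the C# output above: p1 is 21, p2 is 42, and both ip1 and ip2 are 84.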

In the end, I don't think I have a big problem with value types. It's just the mutable value types that I get nervous about, because then you actually need to start thinking about the copy/reference semantics.

Wednesday, August 10, 2005

old stack based languages

We all know that I've been looking at Forth to digest its approach to allowing powerful compile-time constructions.

Fare, an LL-discuss member, mentioned POP-11 as an alternative to Forth (that might be even older). It seems to also have the ability to define compile-time programs; it remains to be seen how it compares to Forth (or Lisp/Scheme, for that matter).

heh


I just realized that my use of "We all know" up above is completely unfounded, because I didn't bother to blog during the month of July, which was when I was investigating Forth so heavily. I suppose I should finish the investigation (or at least both the books from the library) and write something up.

rules for intermediate representations

And the muse of compiler development did rise from her murky swamp, and did say unto the Larceny developers, "Thou shalt not convert your aye-arrh into object form, be it string, bytecode, or otherwise, until the last possible moment."

The Larceny developers did take exception to this rule, pleading "but we have chosen an invertible object form, from which the most exalted client developer may extract the original structured aye-arrh. Its strings are formatted thusly, isomorphic to the structure of the input, and thus less painful for my mortal eyes to gaze upon than the radiance of the aye-arrh structure itself."

To this, the muse of compiler development responds, "verily, you might take such a path; but then you must also provide such inversion functions, and not place the onus of developing such functions upon the shoulders of the most exalted client developer, who is already fed up with trying to make sense of your underspecified and confusingly named interfaces."

Here endeth the lesson.

Thursday, August 04, 2005

On Macros and JavaDot

Tonight I made a fun macro that tries to cut down on the verbosity when you refer to classes using Javadot; normally you have to explicitly write out the full name with all the package prefixes. What I want is to introduce a nice shorthand, similar to the shorthand introduced by the import statement in Java.

Here is the macro:
;; (let/import ((X Y Z1 Z2 ...)) BODY ...) binds Y.Z1, Y.Z2, ... to
;; the expressions X.Y.Z1, X.Y.Z2, ..., and naturally generalizes to
;; more than one prefix X.
;; As one special case, if Zn is (), then that means import the constructor
;; X.Y. as the name Y. (note the period on the end).
(define-syntax let/import
(transformer (lambda (stx ren cmp)
(let ((bindings (cadr stx))
(body (cddr stx))
(construct-new-bindings
(lambda (binding)
(let* ((->s (lambda (x)
(if (null? x)
""
(symbol->string x))))
(prefix (->s (car binding)))
(middle (->s (cadr binding)))
(suffixes (map ->s (cddr binding)))
(s->/append (lambda l
(string->symbol (apply string-append l))))
(make-binding (lambda l
(list (apply s->/append l)
(symbol->javadot-symbol
(apply s->/append prefix "." l))))))
(map (lambda (suffix) (make-binding middle "." suffix))
suffixes)))))
`(,(ren 'let) (,@(apply append (map construct-new-bindings bindings)))
,@body)))))
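The naming rule the macro implements is easier to see outside of the quasiquote. The Python sketch below reproduces just the string construction: for a clause (X Y Z1 Z2 ...) it pairs each short name Y.Zn with the full name X.Y.Zn. The function name `import_bindings` is made up, and an empty string stands in for the () suffix that means "import the constructor".

```python
from typing import List, Tuple


def import_bindings(prefix: str, middle: str, *suffixes: str) -> List[Tuple[str, str]]:
    """Return (short name, fully qualified name) pairs for one
    let/import clause. An empty suffix yields the constructor
    names 'Y.' and 'X.Y.' (note the trailing period)."""
    return [
        (middle + "." + suffix, prefix + "." + middle + "." + suffix)
        for suffix in suffixes
    ]


if __name__ == "__main__":
    pairs = import_bindings(
        "System.Reflection.Emit", "AssemblyBuilderAccess", "RunAndSave$", ""
    )
    for short, full in pairs:
        print(short, "->", full)
```

The macro then binds each short name, via let, to the Javadot symbol for the corresponding full name.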


This pretty much works.

However, using it seems to have exposed what I'd call a bug in how Common Larceny's macro expander interacts with Javadot.

Watch this:

> (let () System.Reflection.Emit.AssemblyBuilderAccess.RunAndSave$)
#<procedure of 0 arguments>

> (let () System.Reflection.Emit.AssemblyBuilderAccess.Run$)
#<procedure of 0 arguments>

> (let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) AssemblyBuilderAccess.RunAndSave$)
#<procedure of 0 arguments>

> (let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) System.Reflection.Emit.AssemblyBuilderAccess.Run$)
#<procedure of 0 arguments>

> (let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) System.Reflection.Emit.AssemblyBuilderAccess.RunAndSave$)

Error: Reference to undefined global variable "system.reflection.emit.assemblybuilderaccess.runandsave$".

>


What huppen!?!

I dunno, but Ryan and I tried looking at the expanded output:

> (macro-expand '(let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) System.Reflection.Emit.AssemblyBuilderAccess.Run$))
((lambda () ((lambda (.assemblybuilderaccess.runandsave$|4) (clr/find-static-field-getter '#f 'system.reflection.emit.assemblybuilderaccess.run)) (clr/find-static-field-getter '#f 'system.reflection.emit.assemblybuilderaccess.runandsave))))

> (macro-expand '(let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) AssemblyBuilderAccess.RunAndSave$))
((lambda () ((lambda (.assemblybuilderaccess.runandsave$|4) .assemblybuilderaccess.runandsave$|4) (clr/find-static-field-getter '#f 'system.reflection.emit.assemblybuilderaccess.runandsave))))

> (macro-expand '(let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) System.Reflection.Emit.AssemblyBuilderAccess.RunAndSave$))
((lambda () ((lambda (.assemblybuilderaccess.runandsave$|4) (clr/find-static-field-getter '#f 'system.reflection.emit.assemblybuilderaccess.runandsave)) system.reflection.emit.assemblybuilderaccess.runandsave$)))

>


Update


It turns out this isn't even a problem with macros; it looks like it's just a bug in our JavaDot implementation when you refer to the same identifier more than once. E.g.:
> (begin (display System.String.class) (display System.String.class))
#<System.RuntimeType System.String>
Error: Reference to undefined global variable "system.string.class".
So much for getting excited about some strange new bug...
