the informal ramblings of a formal language researcher

Sunday, September 10, 2006

Oh hdiutil, what can't you do?

(orig article)

The punchline:

hdiutil convert image.dmg -format UDTO -o image.iso

Although this makes a file with the extension .cdr, the article claims that it is an iso file and can be renamed as such.

This allows me to use the OS X Disk Utility to create .iso images that can then be mounted from Parallels Desktop or VMware.

UPDATE:
The above article seems to be a complete lie, but at least it pointed me at hdiutil. The approach above produces iso images that Mac OS X can mount, but they do not work within Parallels.

However, the following command seems like it does the job:

hdiutil makehybrid -o image image.dmg

Tuesday, August 08, 2006

stupid LEA tricks

LEA is the "load effective address" instruction on x86 processors.

Despite the "L" in the name, this is solely an arithmetic instruction: it computes an address but never touches memory. That means there are some nasty little games you can play with it, like this one, which I got courtesy of Frank Kotler from:

http://groups.google.com/group/alt.lang.asm/msg/27d6ed6183448057?hl=en&


global _start

section .data
number_string db '123', 10

section .text
_start:
nop

push number_string
call atoi
add esp, byte 4

mov ebx, eax
mov eax, 1
int 80h

atoi:
mov edx, [esp + 4] ; pointer to string
xor eax, eax ; clear "result"
.top:
movzx ecx, byte [edx]
inc edx
cmp ecx, byte '0'
jb .done
cmp ecx, byte '9'
ja .done

; we have a valid character - multiply
; result-so-far by 10, subtract '0'
; from the character to convert it to
; a number, and add it to result.

lea eax, [eax + eax * 4]
lea eax, [eax * 2 + ecx - 48]

jmp short .top
.done:
ret
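
To see why the two LEA instructions amount to "multiply by 10 and add the digit", here is a minimal C sketch of the same loop. The function name atoi_lea_style is made up for illustration; each arithmetic line mirrors one LEA from the listing above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name; a C rendering of the assembly atoi above. */
unsigned atoi_lea_style(const char *s)
{
    assert(s != NULL);
    unsigned result = 0;
    for (;;) {
        unsigned c = (unsigned char)*s++;   /* movzx ecx, byte [edx] / inc edx */
        if (c < '0' || c > '9')
            break;                          /* jb .done / ja .done */
        result = result + result * 4;       /* lea eax, [eax + eax * 4]  -> result * 5 */
        result = result * 2 + c - 48;       /* lea eax, [eax * 2 + ecx - 48] -> result * 10 + digit */
    }
    return result;
}
```

For the string '123' followed by a newline, the result goes 0, 1, 12, 123, and the loop stops at the newline because it is below '0'.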

Tuesday, July 18, 2006

free smalltalk! (books, that is)

Just discovered this page today:

On a related note, here are some other free books online that I wasn't aware of until recently:

Wednesday, May 03, 2006

321 contact!

Larceny just ran x86 code in the heap for the first time!

The initial Scheme code:
(define seg (assemble (compile 321)))
(define env (environment-copy (interaction-environment)))
(define cseg (sassy-postpass-segment seg))
(define linked (link-lop-segment cseg env))

Note that the "program" we're compiling here is just evaluating the literal 321. Not very exciting to interpret, but to run off the heap, marvelous!

The (insane) hack on the x86 implementation of MacScheme's invoke:
 dec dword [GLOBALS+G_TIMER]
jnz short %%L1
%%L0: mcall M_INVOKE_EX
jmp short %%L1
%%L2: storer 0, RESULT
const2regf RESULT, fixnum(%1)
int3 ;; debugging interrupt
;;; (the lack of space is significant; gdb ignores "int 3")
add TEMP,-BVEC_TAG+BVEC_HEADER_BYTES
jmp TEMP
%%L1: lea TEMP, [RESULT+(8-PROC_TAG)]
test TEMP_LOW, tag_mask
jnz short %%L0
;; Felix inserted checks if we actually have a
;; fixnum here...
mov TEMP, [TEMP-8+PROC_CODEVECTOR_NATIVE]
test TEMP_LOW, fixtag_mask
jnz short %%L2
storer 0, RESULT
const2regf RESULT, fixnum(%1)
jmp TEMP

The heart of the hack is that invoke is now dispatching based on the kind of code pointer it got. This should allow me to keep using Petit for the heart of the system, but still test the code in the heap directly.
  • For fixnums (which are actually addresses into the text segment, but don't tell the garbage collector), we do the old invocation routine.

  • For bytevector pointers (which should point at codevectors in the heap), strip the tag off the pointer, then add the offset to skip over the bytevector header (on the heap object itself). Finally, JUMP!
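
The dispatch described in the two bullets can be sketched in C roughly as follows. Every constant and name here is an assumption chosen for illustration (the real tag values and header size live in the Larceny sources), and the sketch adds an explicit bytevector-tag check that the assembly hack above does not perform.

```c
#include <assert.h>
#include <stdint.h>

/* All values below are illustrative assumptions, not Larceny's definitions. */
enum {
    FIXTAG_MASK       = 0x3,  /* assumed: fixnums have the low two bits clear */
    TAG_MASK          = 0x7,  /* assumed: pointer tags live in the low three bits */
    BVEC_TAG          = 0x5,  /* assumed tag for bytevector pointers */
    BVEC_HEADER_BYTES = 4     /* assumed size of the header on the heap object */
};

typedef enum { CODE_IN_TEXT, CODE_IN_HEAP, CODE_INVALID } CodeKind;

/* Decide which invocation path a codevector slot should take. */
CodeKind classify_code_pointer(uintptr_t codevector)
{
    if ((codevector & FIXTAG_MASK) == 0)
        return CODE_IN_TEXT;   /* "fixnum": really an address in the text segment */
    if ((codevector & TAG_MASK) == BVEC_TAG)
        return CODE_IN_HEAP;   /* tagged pointer to a codevector in the heap */
    return CODE_INVALID;
}

/* Strip the tag, then skip the header to reach the first instruction. */
uintptr_t heap_entry_address(uintptr_t codevector)
{
    assert(classify_code_pointer(codevector) == CODE_IN_HEAP);
    return codevector - BVEC_TAG + BVEC_HEADER_BYTES;
}
```

Under these assumed constants, a tagged pointer 0x1005 would be classified as heap code and entered at 0x1004.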

And, at long last, the actual result:
(gdb) r -heap sassy.heap
Starting program: /home/pnkfelix/larcenydev/trunk/larceny_src/twobit.bin -heap sassy.heap
(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...(no debugging symbols found)...LARCENY_ROOT not set; using current directory
Larceny v0.91 "Children's Ice Cream" (May 2 2006 10:41:54, precise:Linux:unified)


> (load "testsass.scm")

> (linked)

Program received signal SIGTRAP, Trace/breakpoint trap.
0x08342d49 in ..@6866.L2 ()
(gdb) stepi
0x08342d4c in ..@6866.L2 ()
(gdb) stepi
0x00767454 in ?? ()
(gdb) c
Continuing.
321

Did you see it??? Did you??? That 321 at the end, that was produced by code in the heap. (Well, the value was; the actual action of printing the value is still happening from code in the text segment).

Woo hoo!!!

Wednesday, February 01, 2006

on parallelizing a garbage collector

So here are the lessons I learned from attempting to parallelize a stop-and-copy garbage collector, as implemented in the Larceny runtime, using Cilk's primitives as the basis for the parallelization.

At a high level, based on Flood's paper on parallel collection on SMP machines, I decided to start with a naive recursive stop-and-copy collection algorithm, and parallelize that using Cilk's spawn and sync primitives. (The Flood paper points out the utility of a work-stealing approach to load-balancing, which the Cilk runtime gives you "for free.")

Of course, the devil is in the details. First of all, all I had to start with was a kernel collector that uses Cheney's algorithm, which is not trivial to parallelize (as far as I can tell). So I had to develop a recursive algorithm that behaved in a manner compatible with the Cheney collector, which is not as trivial as it sounds.
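
For readers who have not seen the recursive formulation, here is a minimal C sketch of a naive recursive copying collector over a toy object layout. It is not the Larceny algorithm: the Obj and Space types and the function name are invented for illustration, and it ignores everything that made the real thing hard. The two recursive calls mark where a Cilk version would presumably use spawn, with a sync before returning.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object layout (an assumption for this sketch): two child pointers,
   a payload, and a forwarding pointer that is NULL until the object moves. */
typedef struct Obj {
    struct Obj *forward;
    struct Obj *car, *cdr;
    int value;
} Obj;

/* Toy to-space: a bump allocator over a preallocated array. */
typedef struct {
    Obj *base;
    size_t used, capacity;
} Space;

/* Copy the object graph reachable from o into to-space, depth first. */
Obj *copy_object(Space *to, Obj *o)
{
    if (o == NULL)
        return NULL;
    if (o->forward != NULL)
        return o->forward;              /* already copied: reuse the new address */
    assert(to->used < to->capacity);
    Obj *n = &to->base[to->used++];
    n->forward = NULL;
    n->value = o->value;
    o->forward = n;                     /* install before recursing so cycles terminate */
    n->car = copy_object(to, o->car);   /* candidate for a Cilk spawn */
    n->cdr = copy_object(to, o->cdr);   /* candidate for a Cilk spawn */
    return n;
}
```

Note that once the two calls run in parallel, both the bump allocation and the forwarding-pointer update become races and would need some form of synchronization, such as a compare-and-swap; the sequential sketch sidesteps that entirely.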

Even after I developed the recursive algorithm, I discovered a new setback: the Cilk language, as currently documented, has a restriction that you can only call Cilk functions from other Cilk functions. That is, Cilk functions can call out to C functions, but not vice versa.

This immediately leads to a huge problem for Larceny, because the Scheme code is meant to be able to call into the garbage collector. So that means that the Scheme code needs to be output as Cilk code, not C code. And in fact, the entire runtime and compiler system needs to be shifted to use Cilk instead of C. ACK!

Luckily, the very newest version of Cilk has a beta interface for calling Cilk functions from C code. Unluckily, the interface is indeed beta, and there were some bugs that were showstoppers for me (in hindsight, there were workarounds for the bugs, but that's neither here nor there).

So, here are the important lessons from the project:

  1. Reverse engineering is hard. Even when you're going from a complex algorithm to a simple one, there are always details in "real systems" that get overlooked. (For me, it was particular invariants of the Cheney algorithm: it deliberately avoided traversing certain structures at certain times, and it overloaded the generation bits to control this, even though the behavior was still relevant for non-generational collection.)

  2. If you want your language extension/variant A to interoperate with language B, it needs to go in both directions. You may think it will suffice to have one layered on top of the other, but in the end, you will be wrong, and your users will hate you. You're better off designing that interoperability in from the beginning. (This is relevant for languages that seek to interoperate with the CLR or the JVM, in that your extension language should define classes that themselves can be invoked or extended by terms in C# or Java, respectively.)

Tuesday, January 03, 2006

They ID'ed me 100%

Bourbon
Congratulations! You're 118 proof, with specific scores in beer (20), wine (100), and liquor (78).
Screw all that namby-pamby chick stuff, you're going straight for the bottle and a shot glass! It'll take more than a few shots of Wild Turkey or 99 Bananas before you start seeing pink elephants. You know how to handle your alcohol, and yourself at parties.



My test tracked 4 variables. How you compared to other people your age and gender:
You scored higher than 30% on proof
You scored higher than 50% on beer index
You scored higher than 84% on wine index
You scored higher than 66% on liquor index

Link: The Alcohol Knowledge Test written by hoppersplit on Ok Cupid, home of the 32-Type Dating Test
