the informal ramblings of a formal language researcher

Sunday, November 13, 2005

Swapping Ctrl/CapsLock (PCs or Macs!)

Someone else cares about this, and he cares enough to catalogue how to do it on all sorts of machines!
Just go here.

(hmm. Actually, that was a total lie; the link above only treats windows and linux machines. Well, this guys says how to do it on 10.4 machines.)

Thursday, November 03, 2005

some windows stuff worth knowing.


  • This page tells you how to get free copies of the command line development tools that are including in Visual Studio. That's right, get yourself the C/C++ compiler and header files (as well as some batch scripts to set up your environment properly), all direct from Microsoft.


    • .NET Framework SDK (for the compiler and linker).

    • Platform SDK (for the header files).

    • As a side note to this, I had to learn about the call statement for DOS Batch scripts in order to learn how to make a batch file that would call each of the batch files in sequence, because each of the above SDK has a different batch file to set up the environment.


  • This page tells you how to edit the Windows Registry to change the behavior of the Caps Lock key. For a Emacs user like me, this was a crucial thing to learn, since the control key is really hard to get at on modern Windows laptops. Speaking of which . . .

  • This page tells you how to install Emacs on a Windows machine. It works well enough.

  • Cygwin. Nuff said.

Wednesday, August 31, 2005

GC'ing classes

.NET as far as I can tell, does not garbage collect unreachable code. At best, you can try to manually manage the memory associated with dynamically loaded code by loading into separate AppDomains that you unload by hand. I have not really experimented with this option.

I was discussing this problem with a friend, who asserted that Java has the same problem.

To prove him wrong, I wrote the following class. You run it on the command line, passing an numeric argument that indicates the number of distinct classes you want to load. Try it out with and without the -Xnoclassgc option in Java!

import java.lang.Integer;
import java.lang.ClassLoader;
import java.lang.Class;
import java.lang.ClassNotFoundException;

public class ClassSFS {

public static void println(String s) {
System.out.println(s);
}
public static void println() {
System.out.println();
}
public static void print(String s) {
System.out.print(s);
}
public static void main(String[] args) {
println("Hello World");
int num_classes = Integer.parseInt(args[0]);
initbytes();
for(int i = 0; i < num_classes; i++) {

Tbytes[ 48 + 5 ] = (byte) (0x61 + (i/100) % 26);
Tbytes[ 48 + 6 ] = (byte) (0x61 + (i/10) % 26);
Tbytes[ 48 + 7 ] = (byte) (0x61 + (i/1) % 26);

CLoader cl = new CLoader();
try {
Class c = cl.findClass("T");
Object x = c.newInstance();
System.out.println("ClassLoader "+i+", x:"+x);
} catch (ClassNotFoundException e) {
System.out.println("ClassLoader "+i+", ClassNotFound e:"+e);
} catch (InstantiationException e) {
System.out.println("ClassLoader "+i+", InstantiationException e:"+e);
} catch (IllegalAccessException e) {
System.out.println("ClassLoader "+i+", IllegalAccessException e:"+e);
}
}
}

static class CLoader extends ClassLoader {
CLoader() { super(); }
public Class findClass(String name) throws ClassNotFoundException {
return
super.defineClass(name,
Tbytes,
0,
Tbytes.length);
}
}


private static int[] iTbytes = {

0xca, 0xfe, 0xba, 0xbe, 0x00, 0x00, 0x00, 0x2e, 0x00, 0x20, 0x0a, 0x00, 0x0a, 0x00, 0x16, 0x07,
0x00, 0x17, 0x0a, 0x00, 0x02, 0x00, 0x16, 0x08, 0x00, 0x18, 0x0a, 0x00, 0x02, 0x00, 0x19, 0x09,
0x00, 0x09, 0x00, 0x1a, 0x0a, 0x00, 0x02, 0x00, 0x1b, 0x08, 0x00, 0x0b, 0x07, 0x00, 0x1c, 0x07,
/* f e e */
0x00, 0x1d, 0x01, 0x00, 0x03, 0x66, 0x65, 0x65, 0x01, 0x00, 0x12, 0x4c, 0x6a, 0x61, 0x76, 0x61,
0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67, 0x3b, 0x01, 0x00, 0x06,
0x3c, 0x69, 0x6e, 0x69, 0x74, 0x3e, 0x01, 0x00, 0x03, 0x28, 0x29, 0x56, 0x01, 0x00, 0x04, 0x43,
0x6f, 0x64, 0x65, 0x01, 0x00, 0x0f, 0x4c, 0x69, 0x6e, 0x65, 0x4e, 0x75, 0x6d, 0x62, 0x65, 0x72,
0x54, 0x61, 0x62, 0x6c, 0x65, 0x01, 0x00, 0x08, 0x74, 0x6f, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67,
0x01, 0x00, 0x14, 0x28, 0x29, 0x4c, 0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f,
0x53, 0x74, 0x72, 0x69, 0x6e, 0x67, 0x3b, 0x01, 0x00, 0x08, 0x3c, 0x63, 0x6c, 0x69, 0x6e, 0x69,
0x74, 0x3e, 0x01, 0x00, 0x0a, 0x53, 0x6f, 0x75, 0x72, 0x63, 0x65, 0x46, 0x69, 0x6c, 0x65, 0x01,
0x00, 0x06, 0x54, 0x2e, 0x6a, 0x61, 0x76, 0x61, 0x0c, 0x00, 0x0d, 0x00, 0x0e, 0x01, 0x00, 0x16,
0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67,
0x42, 0x75, 0x66, 0x66, 0x65, 0x72, 0x01, 0x00, 0x03, 0x54, 0x3c, 0x3e, 0x0c, 0x00, 0x1e, 0x00,
0x1f, 0x0c, 0x00, 0x0b, 0x00, 0x0c, 0x0c, 0x00, 0x11, 0x00, 0x12, 0x01, 0x00, 0x01, 0x54, 0x01,
0x00, 0x10, 0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x4f, 0x62, 0x6a, 0x65,
0x63, 0x74, 0x01, 0x00, 0x06, 0x61, 0x70, 0x70, 0x65, 0x6e, 0x64, 0x01, 0x00, 0x2c, 0x28, 0x4c,
0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x53, 0x74, 0x72, 0x69, 0x6e, 0x67,
0x3b, 0x29, 0x4c, 0x6a, 0x61, 0x76, 0x61, 0x2f, 0x6c, 0x61, 0x6e, 0x67, 0x2f, 0x53, 0x74, 0x72,
0x69, 0x6e, 0x67, 0x42, 0x75, 0x66, 0x66, 0x65, 0x72, 0x3b, 0x00, 0x21, 0x00, 0x09, 0x00, 0x0a,
0x00, 0x00, 0x00, 0x01, 0x00, 0x0a, 0x00, 0x0b, 0x00, 0x0c, 0x00, 0x00, 0x00, 0x03, 0x00, 0x01,
0x00, 0x0d, 0x00, 0x0e, 0x00, 0x01, 0x00, 0x0f, 0x00, 0x00, 0x00, 0x1d, 0x00, 0x01, 0x00, 0x01,
0x00, 0x00, 0x00, 0x05, 0x2a, 0xb7, 0x00, 0x01, 0xb1, 0x00, 0x00, 0x00, 0x01, 0x00, 0x10, 0x00,
0x00, 0x00, 0x06, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00, 0x11, 0x00, 0x12, 0x00,
0x01, 0x00, 0x0f, 0x00, 0x00, 0x00, 0x2e, 0x00, 0x02, 0x00, 0x01, 0x00, 0x00, 0x00, 0x16, 0xbb,
0x00, 0x02, 0x59, 0xb7, 0x00, 0x03, 0x12, 0x04, 0xb6, 0x00, 0x05, 0xb2, 0x00, 0x06, 0xb6, 0x00,
0x05, 0xb6, 0x00, 0x07, 0xb0, 0x00, 0x00, 0x00, 0x01, 0x00, 0x10, 0x00, 0x00, 0x00, 0x06, 0x00,
0x01, 0x00, 0x00, 0x00, 0x04, 0x00, 0x08, 0x00, 0x13, 0x00, 0x0e, 0x00, 0x01, 0x00, 0x0f, 0x00,
0x00, 0x00, 0x1e, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x06, 0x12, 0x08, 0xb3, 0x00, 0x06,
0xb1, 0x00, 0x00, 0x00, 0x01, 0x00, 0x10, 0x00, 0x00, 0x00, 0x06, 0x00, 0x01, 0x00, 0x00, 0x00,
0x02, 0x00, 0x01, 0x00, 0x14, 0x00, 0x00, 0x00, 0x02, 0x00, 0x15

};

private static byte[] Tbytes;

public static void initbytes()
{
Tbytes = new byte[ iTbytes.length ];
for(int i = 0; i < iTbytes.length; i++) {
Tbytes[i] = (byte)(iTbytes[i]);
}

print(""+nybble2char((Tbytes[0]>>4) & 0xF));
print(""+nybble2char((Tbytes[0]>>0) & 0xF));
print(""+nybble2char((Tbytes[1]>>4) & 0xF));
print(""+nybble2char((Tbytes[1]>>0) & 0xF));
print(""+nybble2char((Tbytes[2]>>4) & 0xF));
print(""+nybble2char((Tbytes[2]>>0) & 0xF));
print(""+nybble2char((Tbytes[3]>>4) & 0xF));
print(""+nybble2char((Tbytes[3]>>0) & 0xF));
println();
}

private static char nybble2char(int b) {
switch (b) {
case 0xf: return 'f';
case 0xe: return 'e';
case 0xd: return 'd';
case 0xc: return 'c';
case 0xb: return 'b';
case 0xa: return 'a';
default: return (char) (b+'a');
}
}

public static int Tcounter = 0;
}


The iTbytes array was generated by compiling public class T { private static String fee = "fee"; public String toString() { return "T<>"+fee; }}, then loading the resulting class file into emacs, switching to hexl-mode, and doing some keyboard-macrology to convert it to something javac would accept.

Thursday, August 25, 2005

C#, pass-by-value, pass-by-reference

Here's a nice little snippet of C# for you.
using System;

namespace InterfaceSample {
public delegate void Changed();
interface IPoint {
int X { get; set; }
int Y { get; set; }
}

struct Point: IPoint {
private int xValue, yValue;
public int X { get { return xValue; } set { xValue = value; } }
public int Y { get { return yValue; } set { yValue = value; } }
}

public class EntryPoint {
public static int Main() {
String formatstr = " p1.X: {0}, p1.Y: {1}, p2.X: {2}, p2.Y: {3} ip1.X: {4}, ip1.Y: {5}, ip2.X: {6}, ip2.Y: {7}";

Point p1 = new Point();
p1.X = p1.Y = 42;
IPoint ip1 = p1;
Point p2 = (Point) ip1;
IPoint ip2 = ip1;

Console.WriteLine(formatstr, p1.X, p1.Y, p2.X, p2.Y, ip1.X, ip1.Y, ip2.X, ip2.Y);
p1.X = p1.Y = 21; Console.WriteLine("p1.X = p1.Y = 21;");
Console.WriteLine(formatstr, p1.X, p1.Y, p2.X, p2.Y, ip1.X, ip1.Y, ip2.X, ip2.Y);
ip1.X = ip1.Y = 84; Console.WriteLine("ip1.X = ip1.Y = 84;");
Console.WriteLine(formatstr, p1.X, p1.Y, p2.X, p2.Y, ip1.X, ip1.Y, ip2.X, ip2.Y);
return 0;
}
}
}
In C#, classes and interfaces can have properties, which have usage syntax similar to fields but semantics similar to methods. In the interface, you just declare the property name and whether it has getters/setters; in the class, you then define the behavior you want for the property.

In C#, you can declare struct types, which are value types in the language. This means that they are passed by value. However, they are not immutable.

In C#, a struct type can implement an interface. Ah, what fun.

Here's the output of the above program:
  p1.X: 42, p1.Y: 42, p2.X: 42, p2.Y: 42 ip1.X: 42, ip1.Y: 42, ip2.X: 42, ip2.Y: 42
p1.X = p1.Y = 21;
p1.X: 21, p1.Y: 21, p2.X: 42, p2.Y: 42 ip1.X: 42, ip1.Y: 42, ip2.X: 42, ip2.Y: 42
ip1.X = ip1.Y = 84;
p1.X: 21, p1.Y: 21, p2.X: 42, p2.Y: 42 ip1.X: 84, ip1.Y: 84, ip2.X: 84, ip2.Y: 84


The mutation to p's properties is not propagated over to ip, which looks odd to a Java programmer like me. This is because the assignment ip1 = p1 makes a copy of p1 when it "boxes" it into ip1.

And even odder, given the previous paragraph, the mutations to ip1 are carried over to ip2. Actually, this isn't so odd, since ip1 is an interface after all, that might ("must", considering boxing?) be implemented by an object, and therefore the assignment must copy a reference to the object.

Finally, you can copy from the interface back to a Point, but this requires a cast. This makes sense (see previous paragraph).

In the end, I don't think I have a big problem with value types. Its just the mutable value types that I get nervous about, because then you actually need to start thinking about the copy/reference semantics.

Wednesday, August 10, 2005

old stack based languages

We all know that I've been looking at Forth to digest its approach to allowing powerful compile-time constructions.

Fare, an LL-discuss member, mentioned POP-11 as an alternative to Forth (that might be even older). It seems to also have the ability to define compile-time programs; it remains to be seen how it compares to Forth (or Lisp/Scheme, for that matter).

heh


I just realized that my use of "We all know" up above is completely unfounded, because I didn't bother to blog during the month of July, which was when I was investigating Forth so heavily. I suppose I should finish the investigation (or at least both the books from the library) and write something up.

rules for intermediate representations

And the muse of compiler development did rise from her murky swamp, and did say unto the Larceny developers, "Thou shalt not convert your aye-arrh into object form, be it string, bytecode, or otherwise, until the last possible moment."

The Larceny developers did take exception to this rule, pleading "but we have chosen an invertible object form, from which the most exhalted client developer may extract the original structured aye-arrh. Its strings are formatted thusly, isomorphic to the structure of the input, and thus less painful for my mortal eyes to gaze upon than the radiance of the aye-arrh structure itself."

To this, the muse of compiler development rules responds, "verily, you might take such a path; but then you must also provide such inversion functions, and not place the onus of developing such functions upon the shoulders of the most exhalted client developer, who is already fed up with trying to make sense of your underspecified and confusingly named interfaces.

Here endeth the lesson.

Thursday, August 04, 2005

On Macros and JavaDot

Tonight I made a fun macro that tries to cut down on the verbosity when you refer to classes using Javadot; normally you have to explicit write out the full name with all the package prefixes. What I want is to introduce a nice shorthand, similar to the shorthand introduced by the import statement in Java.

Here is the macro:
;; (let/import ((X Y Z1 Z2 ...)) BODY ...) binds Y.Z1, Y.Z2, ... to
;; the expressions X.Y.Z1, X.Y.Z2, ..., and naturally generalizes to
;; more than one prefix X.
;; As one special case, if Zn is (), then that means import the constructor
;; X.Y. as the name Y. (note the period on the end).
(define-syntax let/import
(transformer (lambda (stx ren cmp)
(let ((bindings (cadr stx))
(body (cddr stx))
(construct-new-bindings
(lambda (binding)
(let* ((->s (lambda (x)
(if (null? x)
""
(symbol->string x))))
(prefix (->s (car binding)))
(middle (->s (cadr binding)))
(suffixes (map ->s (cddr binding)))
(s->/append (lambda l
(string->symbol (apply string-append l))))
(make-binding (lambda l
(list (apply s->/append l)
(symbol->javadot-symbol
(apply s->/append prefix "." l))))))
(map (lambda (suffix) (make-binding middle "." suffix))
suffixes)))))
`(,(ren 'let) (,@(apply append (map construct-new-bindings bindings)))
,@body)))))


This pretty much works.

However, using it seems to have exposed what I'd call a bug in how Common Larceny's macro expander interacts with Javadot.

Watch this:

> (let () System.Reflection.Emit.AssemblyBuilderAccess.RunAndSave$)
#<procedure of 0 arguments>

> (let () System.Reflection.Emit.AssemblyBuilderAccess.Run$)
#<procedure of 0 arguments>

> (let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) AssemblyBuilderAccess.RunAndSave$)
#<procedure of 0 arguments>

> (let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) System.Reflection.Emit.AssemblyBuilderAccess.Run$)
#<procedure of 0 arguments>

> (let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) System.Reflection.Emit.AssemblyBuilderAccess.RunAndSave$)

Error: Reference to undefined global variable "system.reflection.emit.assemblybuilderaccess.runandsave$".

>


What huppen!?!

I dunno, but Ryan and I tried looking at the expanded output:

> (macro-expand '(let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) System.Reflection.Emit.AssemblyBuilderAccess.Run$))
((lambda () ((lambda (.assemblybuilderaccess.runandsave$|4) (clr/find-static-field-getter '#f 'system.reflection.emit.assemblybuilderaccess.run)) (clr/find-static-field-getter '#f 'system.reflection.emit.assemblybuilderaccess.runandsave))))

> (macro-expand '(let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) AssemblyBuilderAccess.RunAndSave$))
((lambda () ((lambda (.assemblybuilderaccess.runandsave$|4) .assemblybuilderaccess.runandsave$|4) (clr/find-static-field-getter '#f 'system.reflection.emit.assemblybuilderaccess.runandsave))))

> (macro-expand '(let/import ((System.Reflection.Emit AssemblyBuilderAccess RunAndSave$)) System.Reflection.Emit.AssemblyBuilderAccess.RunAndSave$))
((lambda () ((lambda (.assemblybuilderaccess.runandsave$|4) (clr/find-static-field-getter '#f 'system.reflection.emit.assemblybuilderaccess.runandsave)) system.reflection.emit.assemblybuilderaccess.runandsave$)))

>


Update


It turns out this isn't even a problem with Macros; it looks like its just a bug in our JavaDot implementation when you refer to the same identifier more than once. E.g.:
> (begin (display System.String.class) (display System.String.class))
#<System.RuntimeType System.String>
Error: Reference to undefined global variable "system.string.class".
So much for getting excited about some strange new bug...

Wednesday, August 03, 2005

Tales from the Larceny source: Eta the Ultimate


; Note: the idiom that is seen in this file,
; (emit-fixup-proc! as (lambda (b l) (fixup b l)))
; when `fixup' is a local procedure, avoids allocation of the closure
; except in the cases where the fixup is in fact needed, for gains in
; speed and reduction in allocation. (Ask me if you want numbers.)

I didn't understand this note in sparcasm.sch when I read it a week or so ago, so I asked Will to explain it to me. Here is what I got out of the explanation.

The code is something like:
(define (foo as x)
(define (fixup bv loc) --omitted--)
(if (usually-true)
(bar)
(emit-fixup-proc! as (lambda (b l) (fixup b l)))))
The idea here is that we know all the points where fixup is used, and it is being invoked at each one. Thus, fixup qualifies as a "known local procedure.", and we can compile it directly to a sequence of machine code instructions in the current text segment, and make it just another label in the compiled machine code for foo that we can jump to.

If we hadn't eta-expanded the reference to fixup, a la:
(define (foo as x)
(define (fixup bv loc) --omitted--)
(if (usually-true)
(bar)
(emit-fixup-proc! as fixup)))
then fixup is no longer a "known local procedure", because we cannot tell what emit-fixup-proc! will do with it (e.g. it may store it into a global variable which will later be invoked), and therefore in this latter case, we will conservatively generate a closure object and bind that to fixup.

In the former case, we also generate a closure object for the whole lambda expression, but we only generate this object on the uncommon control flow path; fixup itself is a free variable referenced by the closure, and therefore remains a simple label in the machine code text of foo.

This leads to a few questions, none of which I thought of in time to ask Will about them:
  1. Couldn't we have acheived the same effect (but without eta-expanding) by pushing the definition of fixup down closer to its use?
    • Quesswork answer: yes, but in the real source tree, fixup is actually referenced multiple times. Even then, one might be able to get away with cloning the definition and pushing the seperate clones down, but now you have to trade off the blow up in static code size versus the cost of allocating the closure at runtime.
  2. What about an optimization pass to automatically introduce eta expansions on references like these to otherwise known local procedures?
    • Quesswork answer: well, note here we rely on a human provided assurance as to what the common code path is. If it were more common to take the other route, then I don't think eta-expansion would be a win anymore. What kind of performance hit could such a transformation introduce then?
  3. What about optimizing whatever is introducing the code for the allocation of the closure for fixup, so that the introduced code doesn't allocate until it absolutely must (due to data-dependency requirements, either by a use of fixup or a mutation of some data that the construction of fixup requires)?
    • I dunno. Sounds tricky. Also note that this is pretty similar to (1.) above, except applied at a lower level than Scheme source.

Thursday, June 30, 2005

Wednesday, June 29, 2005

bonus (dfw) !

And as an added bonus, here is pretty good commencement speech by David Foster Wallace. Its no "Wear Sunscreen", in that I don't see it being put to music and put on VH1 anytime soon, but its probably bettter than "No Sex in the Champagne Room."

David Foster Wallace, in case you didn't know, is the author of Infinite Jest, a book which I'm still not sure whether was worth finishing (other than to be able to say at dinner, "I finished Infinite Jest.", which is a great test of the people you're talking to, because the right response to this statement really is "Why?").

He has other books as well ("Interviews with hideous men") but those didn't grab me (and drop me) in the same way that Infinite Jest did.

self and virtual types

I've been listening on discussions in the lab lately on handling self types. Sam mentioned something called virtual types. I know nothing about them.

However, I happened to run into an article by Remy and Vouillon arguing that virtual types (or at least common uses of virtual types) do not need special treatment in O'Caml, and gives some example code.

So I want to go back over the article later and really digest what is going on. I followed the argument as presented when I read it directly, but I'm feeling I may have missed the point when they were talking about camels at the end.

Monday, June 20, 2005

larceny unleashed

Last week, Will finally decided it was time to "release" Petit Larceny (as well as the latest build for Native Larceny).

I don't have much else to add to the subject. (other than its not really what anyone is willing to call a formal release; Will calls it a "Developer's Alpha", and I call it "Potentially Unembarressing")

However, it is nice that we've hit this stage.

Next items on my agenda:
  • Finish term paper on "Computability and Complexity of Type Inference" (this mostly consists of poking Carl with progressively larger sticks).
  • Finish getting NASM support working for Larceny on win32 platforms (so that we can release for windows as well as the current Mac OS X and Sparc).
  • Merge the Common Larceny and Petit Larceny (aka release_2) source branches in the CVS repository.
  • Start planning out how to implement the Garbage First collector in Larceny's runtime.

Thursday, June 16, 2005

garbage collection in cylone

I just read a nice lil' paper that gives me hope for my own research.

It sounds like they put a region system to good use in developing a simple two-space garbage collector.

However, this blog entry indicates that they haven't figured out how to do a generational collector yet, and that they think it might require dependent types. Yowza. I've been thinking about the problem as well, but I never got to the point where it seemed like something that sophisticated would be required. . .

Thursday, June 09, 2005

before you die, i'd like to. . .

Here is a page of links on PowerPC assembly programming.

Yeah, I've taken my sweet time getting around to learning ppc asm, so now one must wonder, "why bother?"

No answer.

apple, suicide, and analogies

This is a "fun" posting from 2001.

No, the analogy doesn't entirely stand up on its own. But its still worth considering.

The other issue that some people are talking about is DRM. Is that a driving issue? Its certainly not one that Steve Jobs was willing to mention on stage. Maybe its just paranoia.

Monday, June 06, 2005

"i for one welcome our new intel overlords"

. . . but not really.

Yep, Apple is going to be a Switcher. And I'm somewhat pissed about it.

I really had bought into the argument that RISC processors were the right solution for the long term, and that PowerPC was more scalable than Intel, etc etc. If nothing else, Intel processors meager register set really sucks.

Perhaps the worst thing is that it is going to be very hard to convince anyone to buy a new mac for the next year, at least. Sigh.

Thursday, May 26, 2005

The power set axiom and the specification schema

Some people at the lab know that I occasionally (half-jokingly) question the existence of the real numbers and the (informal) validity of a set of axioms that imply that the reals exist.

Most recently I had pinned this down to the powerset axiom in Zermelo set theory. The powerset axiom is: "For any set z, there is a set whose members are just the subsets of z." [ forall z exists y forall x (x in y iff forall w ( w in x implies w in z )) ].

Let me be clear here: I have no problem with the powerset operator being applied to finite sets. I definitely see that for any finite set z, the powerset of z is easily constructed. The questionable part is: why should I believe that the powerset operator is applicable to infinite sets? Given the set of natural numbers N, if we allow the existance of its powerset, 2^N, then there are elements of 2^N that cannot be described by a finite piece of text. Why should we assume that such sets exist?

I recently picked up Boolos' book, "Logic, Logic, and Logic", again, and started looking there for answers to questions I've been having recently about Linear Logic. The first chapter presents "The Iterative Conception of Set", where Boolos describes a staged model for constructing the sets described by Zermelo's axioms (which include the powerset axiom above).

There are a lot of rules for building up these stages, but most of them are straightforward and believable. I skimmed them over, said "okay", and moved on to the next section, which argues that the axioms of Zermelo Set theory follow from the rules given for building the stages. This includes the Powerset axiom; so I was naturally interested in reading this more carefully.

The derivation of the Powerset axiom hinges on the truth of the "specification axioms" of the staged model. When I saw that, I said, "okay, I missed the details of the specification axioms, lets revisit it. . ." It is actually a schema describing an infinite set of formulas that we take as axioms. It is phrased as follows:

"forall s exists y forall x (x in y iff [X] and exists t (t earlierThan s and x formedAt t)))"
where [X] is a formula of in which no occurrence of "y" is free.

As Boolos puts it, "any such axiom will say that for any stage there is a set of just those sets to which [X] applies that are formed before that stage. Let us call these axioms specification axioms."

To derive the powerset axiom, one starts with the specification axiom:
"forall s exists y forall x (x in y iff (forall w (w in x implies w in z) and exists t (t earlierThan s and x formedAt t)))"
which is the same as the above with [X] := "forall w (w in x implies w in z)". This specification axiom says "for any stage, there is a set of all subsets of z formed at earlier stages." (the argument continues from there, but I don't want to relay the remainder of it at this point.)

z is free in the above specification axioms. So it seems that we're actually going to be employing an infinite number of the specification axioms to argue that the powerset axiom is true, since the powerset axiom is quantified over all z; is this okay? Plus, I'm no longer so sure that it is sensible to put no constraints on [X] other than that no occurrence of "y" be free in [X].

But these are pretty flimsy questions; and that's a good thing for me, because it means that I'm narrowing down my uneasiness about the powerset axiom into more bite-size chunks that I can process.

Monday, May 23, 2005

Linear Logic (with urls!)

Girard, 1987. Very hard to read. Found on
Girard's pubs page. Girard, 1995 has a significantly lower barrier to entry.

Lafont's collection. The only one I've skimmed is a Linear Abstract Machine, which isn't even downloadable there.

Another collection

Wadler's collection.

Saturday, May 21, 2005

mailman, the return

As the subject indicates, I had a brief reencounter with a prior post today.

Short version of story (which is all I have time for today): when in doubt, check that the qrunner daemon has actually been started.

Tuesday, May 17, 2005

dlopen on win32 ?

I spent today getting Larceny to build on Cygwin. After I finished getting something that seemd to evaluate simple Scheme expressions, I tried dynamic loading. And it didn't work.

So at last I realized that I had been misinterpreting many of Lars' statements: dynamic loading doesn't work at all on Windows (even under Cygwin, which was the key bit I was missing).

Thus began a huge quest to figure out how dynamic loading works on Windows. Ha. Ha. Ha.

Here are some links I've found trying to explain it to Unix-heads like myself (or just looked interesting as I browsed over the Google output).

I think I need to pay special attention to the "Gimp" solution documented on the EDLL page; I don't mind writing a .def file that has all the millicode routines listed in it.

Thursday, May 12, 2005

linear objects

This is just a note to myself to revisit the
following article by Henry Baker:
'Use-Once' Variables and Linear Objects

I really should start making a big list for Stevie and I to go through in preparation for our PL Jr talk.

See also:

Friday, April 29, 2005

publication pressure

I have to say, I need to try to publish my work more often. (For other than
the obvious reason that I need to publish work, period.)

It really forces you to make everything concrete. To the max.

Almost as much as actually implementing it would.

Tuesday, April 19, 2005

System F and semi-unification

This is somewhat of a note to myself to revist the responses posted.

I'm not 100% percent sure the original author (Mike) really had a grasp on what he was saying. On the one hand, he wanted to get an intuition as to which programs would be problematic to infer types for. But then he started talking about reducing inference for those problems to semi-unification; that's not the right idea! The idea is that you take a bog-standard undecidable problem (e.g. the halting problem), reduce that to semi-unification, and then reduce the reduction to a sample term in System F.

Of course, this would probably yield a term so complex that it would boggle the mind, and not provide the intuition that he was hoping for.

If, on the other hand, he was just interested in the idea of using a (partial) semi-unifier to (partially) perform type inference for System F, then that's a different story. Hard to tell. But lots of interesting points and links in the responses.

subtextual response

I've been looking at Jonathan Edwards' experimental programming
environment/language, Subtextual

I am semi-underwhelmed. But there are some interesting connections with spreadsheet programming here.

I later read this post from the related Alarming Development blog, and there was one thing there that irk'ed me: the idea that we should embraced copy-and-paste programming.

cut-and-paste programming, I have learned to accept. That's when, if you are tempted to copy-and-paste a snippet of code, you instead cut it out entirely, and then turn it into a seperate subroutine/superclass/unit/module. Then you parameterize accordingly (and of course you are expected to add documentation about the mental model you have in your head about what this piece of code does).

But copy-and-paste is very dangerous. And in fact, in Subtextual, it seems like Edwards has embraced a sort of hybrid between cut and copy, because when you copy subtrees, you don't get a fresh copy immediately; instead you get a tree that is linked back to the original tree, and changes to either will propogate to both. You have a seperate operation, "introduce variant", to actually do the copying in place.

Tuesday, April 12, 2005

mailman, postfix, and pain

Was up late last night trying to set up a mailman-based mailing list to demonstrate such things to my cousin.

I made much use of this page to figure out how to get the whole thing going. But when I went to bed, the damn thing still wasn't working.

Eventually I figured out that I never actually started the mailman daemon that is in charge of processing information as it comes in. I just figured that apache and postfix would know that they needed to call out to these command line procedures when they got certain inputs (e.g. mail to a particular address). Silly me.

So I was dumped into the sewer of systems administration once again. But I've pulled myself out now; I'm so happy I'm a PL researcher.

Monday, April 11, 2005

Observing Finalizers and References on the JVM

As a followup to clarify my previous comment on this post (since Dave and I discussed this question personally):

Here is an experimental revision of the code that Dave posted; it is meant to demonstrate a number of things. The most important are:

  • If you want to really observe a space leak, it is helpful to actually use objects which are large (new int[X] is your friend here).
  • If you want to really observe a space leak, it (unfortunately) seems like you may have to employ the (non-standard) -Xms and -Xmx options to the JVM.
  • There are more reliable/sensible alternatives for monitoring whether an object has been collected than attempting to maintain a hashtable of object identifiers. Namely, use of classes in java.lang.ref.*.

    • Having said this, I admit that I do not fully understand the semantics of weak references, because I was not able to naively replace all of Dave's uses of kill(int) with my killAlt(int) procedure.

  • Finalizers really are run only once. However, that doesn't always mean that you can't collect an object just because the call to its finalizer makes it live again.


I had to backport the code to JDK 1.4 since that's all I have on my poor little Mac OS X box. Sorry. I tried to make the backport clean by using trampoline instantiation of the generic collections rather than cluttering up the code with casts everywhere.

Sample command line invocations of this class:

% java -Xmx1mb -Xms1mb Immortal -dummysize 100000 -num 10

% java -Xmx1mb -Xms1mb Immortal -dummysize 100000 -blowup 10

(Note that due to the braindead way I'm doing argument parsing, the order of the options above is VERY significant).


import java.util.Hashtable;
import java.util.Vector;
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.lang.ref.ReferenceQueue;

public class Immortal {

static class HashtableFromIntegerToInteger {
Hashtable me = new Hashtable();
public int get(int k) { return ((Integer)me.get(new Integer(k))).intValue(); }
public void put(int k, int v) { me.put(new Integer(k), new Integer(v)); }
public void remove(int k) { me.remove(new Integer(k)); }
}

static class HashtableFromIntegerToImmortal {
Hashtable me = new Hashtable();
public Immortal get(int k) { return ((Immortal)me.get(new Integer(k))); }
public void put(int k, Immortal v) { me.put(new Integer(k), v); }
public void remove(int k) { me.remove(new Integer(k)); }
}

static class HashtableFromIntegerToRef {
Hashtable me = new Hashtable();
public Reference get(int k) { return ((Reference)me.get(new Integer(k))); }
public void put(int k, Reference v) { me.put(new Integer(k), v); }
public void remove(int k) { me.remove(new Integer(k)); }
}

/** tracks the number of times each finalizer is run */
// private static Hashtable finalizeCounts = new Hashtable();
private static HashtableFromIntegerToInteger finalizeCounts = new HashtableFromIntegerToInteger();

/** used to control object lifetimes */
private static HashtableFromIntegerToImmortal pointers = new HashtableFromIntegerToImmortal();

private static HashtableFromIntegerToRef phantoms = new HashtableFromIntegerToRef();

private static ReferenceQueue refqueue = new ReferenceQueue();

private static boolean nomesg = false;

private static void mesg(String s) {
if (! nomesg) {
System.out.println(s);
}
}

/** used to generate unique identifier codes */
private static int unique = 0;

/** some "large" state so that we can observe blowup */
public int[] dummydata;
private static int dummysize = 100;

public static void killAlt(int id) {
int finalizeCount = finalizeCounts.get(id);
pointers.remove(id);
while ( ! phantoms.get(id).isEnqueued())
{
mesg("(alt) trying to kill " + id + "...");
System.gc();
}
}

/** deletes the object with the given id. */
public static void kill(int id)
{
int finalizeCount = finalizeCounts.get(id);
pointers.remove(id);
while (finalizeCounts.get(id) == finalizeCount)
{
mesg("trying to kill " + id + "...");
System.gc();
}
}

// The code from these two methods can't be inlined, because
// we rely on their stack frame disappearing to prevent the
// link to the tracked object from persisting.

public static int makeTemporaryObject()
{
Immortal temp = new Immortal();
return temp.id;
}

public static void doSomethingWith(int id)
{
Immortal temp = pointers.get(id);
temp.sayHello();
}

/** identifier code */
private int id;

private Immortal()
{
id = unique++;
mesg("creating instance " + id);
finalizeCounts.put(id, 0);
pointers.put(id, this);
phantoms.put(id, new WeakReference(this, refqueue));
int last = dummysize;
dummydata = new int[last + 1];
dummydata[0] = 1;
dummydata[last] = 1;
}

public void sayHello()
{
System.out.println("hi, I'm number " + id);
}

public void finalize()
{
mesg("finalizing " + id + "...");
finalizeCounts.put(id, finalizeCounts.get(id) + 1);
// clear! *khzh-thump*
pointers.put(id, this);
}

public static void main(String[] args) {

if (args.length == 0) {
int id = Immortal.makeTemporaryObject();

// This causes the finalizer to run (in the GC thread.)
Immortal.kill(id);

// And yet, the object is still alive!
Immortal.doSomethingWith(id);

// This will now loop infinitely, since the finalizer
// will never be run a second time.
Immortal.kill(id);
} else {
for(int i = 0; i < args.length; i++) {
if (false) {
// dummy case for uniform syntax
} else if (args[i].equals("-nomesg")) {
nomesg = true;
} else if (args[i].equals("-dummysize")) {
int num = Integer.parseInt(args[++i]);
dummysize = num;
} else if (args[i].equals("-num")) {
int num = Integer.parseInt(args[++i]);

for(int j = 0; j < num; j++) {
int id = Immortal.makeTemporaryObject();

// This causes the finalizer to run (in the GC thread.)
Immortal.kill(id);

// And yet, the object is still alive!
Immortal.doSomethingWith(id);

// FSK: alternative variant of kill which uses weak references
Immortal.killAlt(id);
}
} else if (args[i].equals("-blowup")) {
int num = Integer.parseInt(args[++i]);

for(int j = 0; j < num; j++) {
int id = Immortal.makeTemporaryObject();

// This causes the finalizer to run (in the GC thread.)
Immortal.kill(id);

// And yet, the object is still alive!
Immortal.doSomethingWith(id);

}
}
}
}
}
}

Saturday, March 26, 2005

"Well-going programs can be typed"

Here is one of the stranger papers I've seen since the Forsythe one:

Well-going programs can be typed

The abstract:

Our idiomatically objectionable title is a pun on Milner's ``well-typed programs do not go wrong' --- because we provide a completeness result for type-checking rather than a soundness result.

We show that the well-behaved functions of untyped PCF are already expressible in typed PCF: any equivalence class of the partial logical equivalence generated from the flat natural numbers in the model given by PCF's operational semantics is inhabited by a well-typed term.


Its got Scott's $D_{\infty}$, PCF, Retractions, PERs, Logical Relations, . . .

And yet the paper itself is 50% tricks with Haskell type classes and GHC extensions.

I think the title is extremely misleading, at least based on my interpretation of the abstract. I thought the title meant: "if your untyped PCF program is well-behaved, then there is some way to add type annotations to yield a well-typed program (presumably with the same behavior)." But abstract semes to really be saying that if you have a well-behaved function, then there is some well-typed function with extensionally equivalent behavior; note the lack of constraint on the form of the typed program text and its relation to the untyped text.

test

My first post. Woot.

Catching up on West Wing. This one has Castro. Lots of Cuban-sounding music in the background. Its kind of like watching The Godfather Part II

Followers