pnkfif

Sunday, November 06, 2011

Middle Squares and Big Nums

No screen shots tonight. But on the semi-bright side, at least this time I am posting something that has to do with TAOCP.

For my second quick experiment to play with the Flash tools, I decided to make a swf that would illustrate the "middle-square" method for pseudo-random sequences of numbers (aka a pseudo-random number generator, or PRNG).

The technique is simple, and aptly named at least if you are used to reading your function compositions as (f ○ g) rather than as the categorical style of (g ; f). Every time you want a new "random" number of let's say 10 digits, you take the previous number N (or the initial seed if you're just starting out), square it (which will yield of number of at most 20 digits, and in fact we'll pad N² out with zeroes so that it is always 20 digits), and then extract the middle 10 digits from that 20 digit N².

It shouldn't take very much code to implement something like this. Except (a ha), a lot of "modern" languages like ActionScript do not have built-in support for arbitrary precision arithmetic. In the particular case of the ECMAscript family, as soon as you get above a certain threshold (usually around 2³⁰ or so, they fall into floating-point mode. That might suffice for a lot of the scripts out there, but it falls down miserably in this case because the majority of the ten digit numbers are obviously larger than 5⋅10⁹ = 5,000,000,000 which is itself larger than 4,294,967,296 = 2³².

So as soon as you want to actually show off Knuth's first example: 5,772,156,649, you discover that you can't use the language's built-in arithmetic to do it.

(And for those of you saying "but it is safe to use floating point to store those ten digit numbers; you get 53 bits of precision in the significand in IEEE 754 floating point": I recommend you go back up and re-read the very first step (or second-word in our ○-composed name). And/or double-check what 2⁵³ is in base-10.

Anyway, that was just a long explanation for why I had to make a quick BigNum class in order to do this task. I wasn't expecting to have to get into BigNum work until I hit Chapter 4 of TAOCP, and I plan to keep the class as minimal as possible (in both performance and features) until I get there.

(I used Emacs htmlize.el to format this code. If I knew how to integrate the CSS it generates into blogspot, I'd have font colors too. But that will have to be a project for another day.)

`BigNum.as` code

package  {
    public class BigNum {
        internal static const bigitLen = 16;
        internal static const base = 10; // Math.pow(2, bigitLen);

        // a base of 10 is *incredibly* inefficient, but it does allow
        // for a very simple toString implementation (and parse as well
        // but lets only put that in if I need to debug)

        public function toString():String {
            // This code *definitely* only works for base == 10

            if (base == 10) {
                var i;
                var r = "";
                var maxNonZeroIdx = 0;
                for (i=0; i < bigits.length; i++) {
                    if (bigits[i] > 0) {
                        maxNonZeroIdx = i;
                    }
                }
                for (i=0; i <= maxNonZeroIdx; i++) {
                    r = bigits[i] + r;
                }
                return r;
            } else {
                throw "unimplemented";
            }
        }

        // AF(this) = (nonNeg?1:-1) * Sum_i{ bigits[i] * base^i }
        private var nonNeg:Boolean;
        private var bigits:Vector.<uint>;

        public function BigNum(num:*=0) {
            nonNeg = true;
            bigits = new Vector.<uint>(1);
            if (num is uint && num < base) {
                bigits[0] = num;
            } else {
                this.parse(String(num));
            }
        }

        // modifies: this
        // effect: sets up this to match input (base 10).
        private function parse(input:String):void
        {
            if (false && base == 10) {
                parse_base10(input);
            } else {
                parse_general(input);
            }
        }

        // modifies: this
        // effect: same as parse(input)
        private function parse_base10(input:String):void
        {
            // This code *definitely* only works for base == 10

            bigits.length = input.length;
            for (var i=0; i < input.length; i++) {
                bigits[input.length - i - 1] = uint(input.charAt(i));
            }
        }

        // modifies: this
        // effect: same as parse(input)
        private function parse_general(input:String):void
        {
            // This code probably works for bases other than 10

            for (var i=0; i < input.length; i++) {
                var amount = uint(input.charAt(i));
                this.mulAndAddInPlace(10, amount);
            }
        }

        // modifies: this
        // effect: this := this*'scalar' + 'carry'
        private function mulAndAddInPlace(scalar:int, carry:int):void
        {
            for (var i=0; i < bigits.length; i++) {
                var a = bigits[i];
                a *= scalar;
                a += carry;
                bigits[i] = a % base;
                carry = (a - (a % base)) / base;
            }
            if (carry > 0) {
                bigits[bigits.length] = carry;
            }
        }

        // modifies: this
        // effect: this := this * 'base'
        private function leftShiftInPlace(amt:uint):void
        {
            for (var i=0; i<amt; i++) {
                bigits.unshift(0);
            }
        }

        private function bigit(idx:uint):uint
        {
            return (idx < bigits.length) ? bigits[idx] : 0;
        }

        public function add(that:BigNum):BigNum {
            // This code may or may not work for bases other than 10

            var r:BigNum = new BigNum;
            var len = Math.max(this.bigits.length, that.bigits.length);
            var carry = 0;
            for (var i=0; i<len; i++) {
                var b = this.bigit(i) + that.bigit(i) + carry;
                r.bigits[i] = b % base;
                carry = (b - (b % base)) / base;
            }
            if (carry > 0) {
                r.bigits[len] = carry;
            }
            return r;
        }

        private function clone():BigNum
        {
            return this.add(new BigNum);
        }

        public function mul(that:BigNum):BigNum {
            // This code may or may not work for bases other than 10

            // for now, multiply the same way you'd do it by hand:
            // by repeated multiply-by-scalar, shift, and sum.
            var accum:BigNum = new BigNum;
            for (var i=0; i<that.bigits.length; i++) {
                var scaled:BigNum = this.clone();
                scaled.mulAndAddInPlace(that.bigits[i], 0);
                scaled.leftShiftInPlace(i);
                accum = accum.add(scaled);
            }
            return accum;
        }
    }

}

Anyway, after the BigNum class done, I whipped up a Flash program that let me interactively input new seeds for such sequences. The first version was somewhat embarrassing, because I insisted on doing everything within the Actions block for the single frame of the SWF movie, rather than using any of the Flash Pro GUI to drag around UI elements.

But at least I had the good sense to factor out the middle square generation code. And then after I had that working acceptably, I was able to go back and experiment with using the Flash Pro GUI to use the built-in UI components more effectively (e.g. so that I could create a scrollable list for the output sequence rather than a non-scrollable label, and so that I could control the number of elements created in the sequence).

It is still not perfect; e.g. it would be better if the sequence were created on the fly in response to the user scrolling down the list. Then there would be no need to impose an a priori limit on the sequence length. But that would require a little bit more revision to the design than I want to undertake at the moment.

Anyway, here are the two versions, presented in three parts: first the middle square generation routine on its own, then the first ugly UI code, and then the better code that presumes that one has used Flash Pro to set up UI components in a sane manner.

Middle Square Generation

function middleSquareSequence(input:String, iters:uint):Vector.<String>
{
    var r:Vector.<String> = new Vector.<String>();
    if (isNaN(Number(input))) {
        r.push(input);
    } else {
        // Illustrate middle-square method for (bad) PRNG
        var digits:uint = input.length;

        // Make digit count even, to simplify definitionn of "middle digits"
        if ((digits % 2) == 1)
            digits = digits + 1;
        var dig2 = uint(digits/2);

        var bn:BigNum = new BigNum(input);
        var s:String = bn.toString();
        while (s.length < digits) {
            s = "0" + s;
        }
        r.push(s);

        for (var i=0; i < iters; i++) {
            var nsqr = bn.mul(bn); // this is <= 2*digits;
            // want the middle digits.
            // Ideally I'd do this via arithmetic, but I don't feel like
            // adding support for such to my hacked up BigNum class at the
            // moment.
            s = nsqr.toString();
            while (s.length < 2*digits) {
                s = "0"+s;
            }
            var sub:String = s.substring(dig2,2*digits-dig2);
            r.push(sub);
            bn = new BigNum(sub);
        }
    }
    return r;
}

Ugly Hacked Up UI Code

import fl.controls.Button;
import fl.controls.TextInput;
import fl.controls.List;
import fl.controls.Label;
import flash.events.MouseEvent;
import fl.controls.NumericStepper;

// ***************************************************************************
// Old hand written code below that did everything without using Flash Pro GUI
// (except for importing components into the SWF).  I.E., this code manually
// creates and positions the UI elements.  Its not really worth the pain for
// one-off scripts like this, especially now that I know how to get
// code-hinting for UI componets by explicitly redeclaring the variables.
// ***************************************************************************

if (true) {
    var b:Button;
    b = new Button;

    // Mutating b/textField doesn't seem to have
    // desirable effects.  The documentation just
    // says its a reference to the Button's internal
    // textfield, which makes me think that this never
    // should have been provided as a mutable property.
    // b.textField = t;

    b.label = "Click"
    // b.setStyle("textFormat", tf);
    // b.width = 400;
    // b.height = 300;
    b.x = 120;
    b.y = 10;
    addChild(b);

    var inp:TextInput;
    inp = new TextInput;
    inp.x = 120;
    inp.y = 40;
    inp.text = "5772156649";
    addChild(inp);

    b.addEventListener(MouseEvent.CLICK,
                       function() {
                           var seq:Vector.<String>;
                           seq = middleSquareSequence(inp.text, 400);
                           l.text = "";
                           for (var i=0; i<seq.length; i++) {
                               l.text += seq[i] + "\n";
                           }
                       });

    var l:Label;
    l = new Label;
    l.x = 220;
    l.y = 10;
    l.height = 400;
    if (true) {
        l.text = "Hi\nworld";
    } else if (false) {
        // (These are some tests of BigNum functionality
        //  I put in as I was getting this going.)
        l.text = (""+new BigNum("1234567890")
                  +"\n" + new BigNum("1234").add(new BigNum("1234"))
                  +"\n" + new BigNum("1234").add(new BigNum("1766"))
                  +"\n" + new BigNum("9999").add(new BigNum("1001"))
                  +"\n" + new BigNum("1000").mul(new BigNum("1000"))
                 );
    }
    l.selectable = true;
    addChild(l);
}

Nicer looking (and shorter) UI code presuming Flash Pro

import fl.controls.Button;
import fl.controls.TextInput;
import fl.controls.List;
import fl.controls.Label;
import flash.events.MouseEvent;
import fl.controls.NumericStepper;

// These are local variable declarations for the UI elements that I added
// manually in Flash Pro.  Apparently the tool is too stupid to know to provide
// code hinting unless I add these declarations (along with the above imports)
// myself.  (Or maybe this is a "feature" rather than a "bug")
var reseedButton:Button;
var seedInput:TextInput;
var outputList:List;
var numItersStepper:NumericStepper;

reseedButton.addEventListener(MouseEvent.CLICK,
  function () {
      outputList.removeAll();
      var seq : Vector.<String>;
      seq = middleSquareSequence(seedInput.text, numItersStepper.value);
      var seen = {};
      var seenRepeat:Boolean = false;
      for (var i=0; i<seq.length; i++) {
          if (seenRepeat || seq[i] in seen) {
              seenRepeat = true;
              outputList.addItem({label:(seq[i]+
                                         " (repeat)")});
          } else {
              outputList.addItem({label:seq[i]});
              seen[seq[i]] = true;
          }
      }
  });

Saturday, November 05, 2011

A Bit of Trignometry

So I wanted to aim small for my initial foray into using Flash Professional to make a small bit of example code for this blog.

(I have done some small proof of concept stuff in Flash Pro and in Flash Builder before, mainly to explore the runtime garbage collector or illustrate short-comings in our ActionScript API's. But none of that has translated over to blog content here before.)

I figure that in general I'll need to be able to render animated text and shapes. Other functionality (mouse event handling, UI elements like buttons or text boxes) can wait for later.

So okay, how about "Hello World," where the text does something interesting. Like rotate 360 degrees in a tight circle around its mid-point; that's the kind of thing that is really easy to show off in DrRacket.

It was easy to get started: a few mouse clicks to select the text tool, create a "Hello World" text object, and make it rotate continuously.

Choose the menu item "File → New..." and choose "ActionScript 3.0" in the New Document dialog window.
Now that you have a fresh blank slate to work with, select the Text tool in the toolbar on the right hand side of the screen:
Create a fresh text object by clicking the Text tool on the white space (the "stage") on the left hand side of the screen:
Create a fresh text object by clicking the Text tool on the white space (the "stage") on the left hand side of the screen:
If you like, you can automatically align the text to sit in the center of the stage, in four clicks: the first to open up the alignment controls, by clicking the "Align" button in the menu bar at the center of the screen, the second to turn on the "Align to stage" option in the alignment controls, and the third and fourth to turn on the "Align vertical center" and "Align horizontal center" options. The latter three steps are illustrated below:
Finally, to make the selected text actually act in a dynamic fashion, double-click one of the premade ActionScript snippets, found in the "Code Snippets" that you can select from that middle menu bar again. For this experiment, I selected the "Rotate Continously" code snippet:
You will probably get a pop-up dialog box saying that this code snippet requires an instance name; just click okay and Flash Pro makes up a name for the text object. And voila:

Except when I tested this (menu item "Control→Test Movie→Test"), I did not get what I expected:

Yes, this text is rotating in a circle... but not around a point in the middle of the text.

The reason is simple: the text object's position is determined relative to the (x,y) coordinate at the upper-left hand corner of the text. When you rotate the text, it rotates around that origin at its upper-left corner, so that corner is what it rotates around.

This discovery turned a somewhat silly test of Flash into a slightly more meaty (though mechanical) puzzle: how do I get the effect I wanted originally.

The answer: Trigonometry! As the rectangle rotates, we also adjust the position of its upper-left corner, moving it in a circle in synchronization with the rotation of the rectange.

(The above picture has flaws; in particular, the line segment r is supposed to end at the center point of the rectangle, but it clearly does not, since the horizontal line going through it is vertically above both the lower-left and the upper-right corners. This should not be possible if that line were truly going through the center of the rectangle. In my defense, this was my first foray into using Adobe Illustrator. I will probably try again later using TikZ/PGF.)

This derivation leads to the following change to the ActionScript code that Flash Pro inserted for us:

textField_1.addEventListener(Event.ENTER_FRAME, fl_RotateContinuously);

var w = this.width;
var h = this.height;
var r = Math.sqrt(w*w/4+h*h/4);
var theta = Math.asin(h/(2*r));
var origin = {x:stage.width/2, y:stage.height/2};

function fl_RotateContinuously(event:Event)
{
   alpha += 0.1;
   var psi = alpha + theta;
   var dx = r * Math.cos(psi);
   var dy = r * Math.sin(psi);
   // trace(w,h,dx,dy);
   textField_1.x = origin.x - dx;
   textField_1.y = origin.y - dy;
   textField_1.rotation = 360*alpha/(2*Math.PI);
}

And, voila, for real this time:

Saturday, October 29, 2011

Bonjour de Paris

I have arrived in Paris! It is great. Stephanie has been posting pictures on her blog.

I had to leave the majority of my library at home, but I did take a special set of textbooks with me: my old editions of TAOCP, all three volumes. I am hoping to work through volumes 2 and 3, and post the results of my experimental implementations here, the same way I posted the code for algorithm 7.2.1.1-W earlier. I have many more languages available at my fingertips now: in addition to the old-standbys, Java, C, and Scheme, I now also have ActionScript, Javascript, and C++. Imight also dabble in a bit of assembly code in x86_64 or ARM from time to time.

At the same time, I am experimenting with alternative tools for authoring blog content. I'm trying to be a good little Adobe employee and try to use our own tools (Dreamweaver in this case) to interface with Blogger. My end goal is to post active Javascript and Flash content on the web, in addtion to the transcripts I post. But that might require I shift over to my personal web server, rather than work through Blogger. We shall see.

Thursday, August 27, 2009

from 90sec to 9sec.

I have been reading Knuth's pre-fascicles from Chapter 7 of TAOCP.

Last night when I transcribed one of the algorithms (7.2.1.1-W) to Scheme code, I got caught up in Knuth's claim that the iterative process described should take "at most a few seconds", yet my first version was taking 90 seconds to run in Larceny. So I started the human iterative process of revising the code, inspecting the compiler output, and doing more runs to figure out where the time was going.

At the end of it, I got the running time down to 9 seconds (from 90; that's a 10x speed-up!) You can shave off a few more seconds by switching to unsafe mode via compiler switches, but it seems like the really important thing was understanding which operations were going to turn into native machine instructions, which ones would turn into millicode invocations in the C runtime, and which ones would turn into Scheme procedure invocations.

(I think 9 qualifies as "a few seconds", especially for a dynamic language like Scheme.)

So here, for your viewing pleasure, is both the first and last versions of the code, along with the log in the comments of the changes I made and what effect they seemed to have.


;; ----------------------------------------------------------------------
;; What is largest set of five letter words connected via single letter 
;; swaps?  To find out, we use specialized representation of sgb-words,
;; where each word is mapped to an index in a bit-table with 2^25 entries
;; (I would represent each bit by a byte for ease of coding, but Larceny
;;  cannot construct objects >= 16 MB, so my hand is forced.)

(define bytevector-bit-ref 
  (lambda (bv k)
    (let ((q (quotient k 8))
          (r (remainder k 8)))
      (fxarithmetic-shift-right
       (fxand (bytevector-u8-ref bv q)
              (fxarithmetic-shift-left 1 r)) r))))

(define bytevector-bit-ref? 
  (lambda (bv k)
    (not (zero? (bytevector-bit-ref bv k)))))

(define bytevector-bit-set! 
  (lambda (bv k bit)
    (let ((q (quotient k 8))
          (r (remainder k 8)))
      (bytevector-u8-set! 
       bv q 
       (fxior (fxand (bytevector-u8-ref bv q)
                     (fxxor #b11111111 (fxarithmetic-shift-left 1 r)))
              (fxarithmetic-shift-left bit r))))))

(define (word->index w)
  (let ((char->num (lambda (c) (- (char->integer c) (char->integer #\a))))
        (offsets (map (lambda (e) (expt 2 e)) '(20 15 10 5 0))))
    (apply + (map * (map char->num (string->list w)) 
                  offsets))))

(define (index->word i)
  (let ((num->char (lambda (n) (integer->char (+ n (char->integer #\a)))))
        (offsets (map (lambda (e) (expt 2 e)) '(20 15 10 5 0))))
    (list->string (map num->char (map (lambda (o) (remainder (quotient i o) (expt 2 5)))
                                      offsets)))))

(define (read-sgb-words file)
  (call-with-input-file file
    (lambda (p)
      (let loop ((words '()))
        (let ((x (read-line p)))
          (cond 
           ((eof-object? x)
            (reverse words))
           (else 
            (loop (cons x words)))))))))

(define (make-sgb-words-table words)
  (let ((table (make-bytevector (quotient (+ (expt 2 25) 7) 8))))
    (for-each (lambda (x) (bytevector-bit-set! table (word->index x) 1))
              words)
    table))

(define words (read-sgb-words "sgb-words.txt"))
(define t (make-sgb-words-table words))

;; This procedure takes 96sec on Chimaera.
;; Knuth says this process should finish in "at most a few seconds"; see below.
(define (find-max-intermix t words)
  (let ((A-best '())
        (l-best 0)
        (wiz (map word->index words))
        (two^25 (expt 2 25))
        (two^20 (expt 2 20))
        (two^15 (expt 2 15))
        (two^10 (expt 2 10))
        (two^5  (expt 2  5)))
    (let loop1 ((w1 wiz))
      (cond 
       ((not (null? w1))
        (let loop2 ((w2 (cdr w1)))
          (cond 
           ((null? w2)
            (loop1 (cdr w1)))
           (else 
            (let* ((w (car w1))
                   (w* (car w2))
                   (z (fxxor w w*))
                   (m (+ two^20 two^15 two^10 two^5 1))
                   (m* (* two^5 m)))
              (cond
               ((not (zero? (fxand (fxxor (fx- z m) z m) m*)))
                (loop2 (cdr w2)))
               (else 
                (let ((m (vector (fxand z (fx- two^5  1))      ; m_0 
                                 (fxand z (fx- two^10 two^5))  ; m_1
                                 (fxand z (fx- two^15 two^10)) ; m_2
                                 (fxand z (fx- two^20 two^15)) ; m_3
                                 (fxand z (fx- two^25 two^20)) ; m_4
                                 )))
                  (let ((l 1) (A (list w)))
                    (let* ((sw (lambda (j) 
                                 (set! w (fxxor w (vector-ref m j)))
                                 (cond ((bytevector-bit-ref? t w)
                                        (set! l (+ l 1))
                                        (set! A (cons w A))))))
                           (swap-0 (lambda () (sw 0)))
                           (swap-1 (lambda () (swap-0) (sw 1) (swap-0)))
                           (swap-2 (lambda () (swap-1) (sw 2) (swap-1)))
                           (swap-3 (lambda () (swap-2) (sw 3) (swap-2)))
                           (swap-4 (lambda () (swap-3) (sw 4) (swap-3))))
                      (swap-4)
                      (cond ((> l l-best)
                             (display A) (newline)
                             (set! l-best l)
                             (set! A-best A))))))
                (loop2 (cdr w2)))))))))))))


;; Knuth says this process should finish in "at most a few seconds" but its 
;; taking Chimaera 96.4sec; lets see if allowing more inlining helps matters...
;; - Using internal define for bytevector-bit-ref and bytevector-bit-ref?
;;   shaved off 6 seconds to 90.1sec
;; - Using macros to define { sw, swap-0, swap-1, swap-2, swap-3, swap-4 }
;;   brings time down to 77.7sec
;; - Using macros to define bytevector-bit-ref and bytevector-bit-ref? 
;;   did not improve the time at all... (but I did not do it properly the first time!)
;; - Lifting the definitions of m and m* did not improve the time.
;; - Replacing the encoding of m_i via a 5-elem vector with explicit
;;   variables { m_0 .. m_4 } may have shaved a fraction of a second off
;;   the time...
;; - Fixing the macro for bytevector-bit-ref and bytevector-bit-ref? did not help.
;; - Replacing fxarithmetic-shift-left and fxarithmetic-shift-right with fxlsh and fxrsh
;;   was huge; brought time down to 40.5sec
;; - Changing encoding of 2^k to thunks rather than let-bound constants did not help anything.
;; - Changing encoding of 2^k to macros rather than thunks did not help anything [[ but I may have been timing the wrong code here ]]
;; - Changing encoding of m and m' to macros rather than let-bound constants did not help anything.
;; 
;;   [[ note on last three items: it looks strongly like Twobit is
;;      reintroducing the let-bindings for temporaries rather than
;;      inline their values in the code, which is interesting.]]
;;   [[ (are fxlsh and friends not interpreted at compile time?) ]]

;; - Manually constant folding fxlsh and friends got down to 39.8sec (maybe not statistically sig delta).
;; - Adding check for bytevector? of t at outset got down to 39.1sec (again maybe not stat sig on own).
;; - Switching to multi-valued fxdiv-and-mod was a terrible idea (x2.5 slowdown)
;; - And even using fxdiv fxmod separately was a terrible idea (x2 slowdown)

;;   [[ We apparently do not have any compiler optimizations for fxquotient and similar; hmm. ]]

;; - Changing the code to use fxrhsl and fxand instead of quotient and remainder
;;   brings running time down to 9sec (from between 37 and 39 seconds; another huge leap!).
 
(define (find-max-intermix.v16 t words)

  (let*-syntax ((bytevector-bit-ref 
                 (syntax-rules () 
                   ((_ bv k)
                    (let ((q (fxrshl k 3))
                          (r (fxand  k #b111)))
                      (fxrsha
                       (fxand (bytevector-u8-ref bv q)
                              (fxlsh 1 r)) r)))))

               (bytevector-bit-ref? 
                (syntax-rules () 
                  ((_ bv k)
                   (not (zero? (bytevector-bit-ref bv k))))))

               (two^25 (syntax-rules () ((_) 33554432)))
               (two^20 (syntax-rules () ((_) 1048576)))
               (two^15 (syntax-rules () ((_) 32768)))
               (two^10 (syntax-rules () ((_) 1024)))
               (two^5  (syntax-rules () ((_) 32)))
               (m      (syntax-rules () ((_) 1082401 )))   ;; (+ (two^20) (two^15) (two^10) (two^5) 1) 
               (m*     (syntax-rules () ((_) 34636832 ))) ;; (* (two^5) (m)))))
               )

 (let* ((A-best '())
        (l-best 0)
        (wiz (map word->index words)))
   (cond 
    ((not (bytevector? t)) (error))
    (else 
    (let loop1 ((w1 wiz))
      (cond 
       ((not (null? w1))
        (let loop2 ((w2 (cdr w1)))
          (cond 
           ((null? w2)
            (loop1 (cdr w1)))
           (else 
            (let* ((w (car w1))
                   (w* (car w2))
                   (z (fxxor w w*)))
              (cond
               ((not (fxzero? (fxand (fxxor (fx- z (m)) z (m)) (m*))))
                (loop2 (cdr w2)))
               (else 
                (let ((m_0 (fxand z (fx- (two^5)  1)))
                      (m_1 (fxand z (fx- (two^10) (two^5))))
                      (m_2 (fxand z (fx- (two^15) (two^10))))
                      (m_3 (fxand z (fx- (two^20) (two^15))))
                      (m_4 (fxand z (fx- (two^25) (two^20)))))
                  (let ((l 1) (A (list w)))
                    (let*-syntax ((upd (syntax-rules ()
                                         ((upd) (cond ((bytevector-bit-ref? t w)
                                                       (set! l (+ l 1))
                                                       (set! A (cons w A)))))))
                                  (sw (syntax-rules ()
                                        ((sw 0) (begin (set! w (fxxor w m_0)) (upd)))
                                        ((sw 1) (begin (set! w (fxxor w m_1)) (upd)))
                                        ((sw 2) (begin (set! w (fxxor w m_2)) (upd)))
                                        ((sw 3) (begin (set! w (fxxor w m_3)) (upd)))
                                        ((sw 4) (begin (set! w (fxxor w m_4)) (upd)))))
                                  (swap-0 (syntax-rules () 
                                            ((_) (sw 0))))
                                  (swap-1 (syntax-rules ()
                                            ((_) (begin (swap-0) (sw 1) (swap-0)))))
                                  (swap-2 (syntax-rules () 
                                            ((_) (begin (swap-1) (sw 2) (swap-1)))))
                                  (swap-3 (syntax-rules () 
                                            ((_) (begin (swap-2) (sw 3) (swap-2)))))
                                  (swap-4 (syntax-rules () 
                                            ((_) (begin (swap-3) (sw 4) (swap-3))))))
                     (begin
                       (swap-4)
                       (cond ((> l l-best)
                              (display A) (newline)
                              (set! l-best l)
                              (set! A-best A)))))))
                (loop2 (cdr w2))))))))))))))))

UPDATE: I realized it might be worthwhile to identify how much speed-up would have resulted if I had just applied the most important steps to the original version at the outset.

The answer is that just:

Turning the bit-referencing operations into internal defines within the procedure, and

Using the fxlsh, fxrshl, and fxand operations for the definitions of the bit-referencing operations

brings the running time down to 18 seconds. So somewhere around 70 seconds of the time was spent on inefficiencies due to using general purpose arithmetic operations when fixnum operations (that the compiler specializes) would probably suffice for most people; no need to use macros to force an inlined swap(4) operation (which is what Knuth recommends in his text).

Here is the 18 second version. Note that its code bytevector after being compiled by Twobit is 4.0 kilobytes, while the more heavily optimized find-max-intermix.v16 is 17.8 kilobytes; that is a 4x space regression for a 2x speed improvement.


(define (find-max-intermix t words)

  (define bytevector-bit-ref 
    (lambda (bv k)
      (let ((q (fxrshl k 3))
            (r (fxand k #b111)))
        (fxrshl
         (fxand (bytevector-u8-ref bv q)
                (fxlsh 1 r)) r))))

  (define bytevector-bit-ref? 
    (lambda (bv k)
      (not (zero? (bytevector-bit-ref bv k)))))

  (let ((A-best '())
        (l-best 0)
        (wiz (map word->index words))
        (two^25 (expt 2 25))
        (two^20 (expt 2 20))
        (two^15 (expt 2 15))
        (two^10 (expt 2 10))
        (two^5  (expt 2  5)))
    (let loop1 ((w1 wiz))
      (cond 
       ((not (null? w1))
        (let loop2 ((w2 (cdr w1)))
          (cond 
           ((null? w2)
            (loop1 (cdr w1)))
           (else 
            (let* ((w (car w1))
                   (w* (car w2))
                   (z (fxxor w w*))
                   (m (+ two^20 two^15 two^10 two^5 1))
                   (m* (* two^5 m)))
              (cond
               ((not (zero? (fxand (fxxor (fx- z m) z m) m*)))
                (loop2 (cdr w2)))
               (else 
                (let ((m (vector (fxand z (fx- two^5  1))      ; m_0 
                                 (fxand z (fx- two^10 two^5))  ; m_1
                                 (fxand z (fx- two^15 two^10)) ; m_2
                                 (fxand z (fx- two^20 two^15)) ; m_3
                                 (fxand z (fx- two^25 two^20)) ; m_4
                                 )))
                  (let ((l 1) (A (list w)))
                    (let* ((sw (lambda (j) 
                                 (set! w (fxxor w (vector-ref m j)))
                                 (cond ((bytevector-bit-ref? t w)
                                        (set! l (+ l 1))
                                        (set! A (cons w A))))))
                           (swap-0 (lambda () (sw 0)))
                           (swap-1 (lambda () (swap-0) (sw 1) (swap-0)))
                           (swap-2 (lambda () (swap-1) (sw 2) (swap-1)))
                           (swap-3 (lambda () (swap-2) (sw 3) (swap-2)))
                           (swap-4 (lambda () (swap-3) (sw 4) (swap-3))))
                      (swap-4)
                      (cond ((> l l-best)
                             (display A) (newline)
                             (set! l-best l)
                             (set! A-best A))))))
                (loop2 (cdr w2)))))))))))))

mismatch twixt formals and invocation forms

As a follow up to my yet-another-generator macro post:

yesterday at lunch I was trying to describe my problems generalizing my generator macro to handle arbitrary lambda formals lists. Ryan was the one who summed the problem up best; I think he said "there is a asymmetry between the format of the formals list and the format of invocation"

The problem in essence is that there are two distinct cases that are easy.
If you want to match a proper list of formals and generate a corresponding invocation, you do something like:


(define-syntax adder 
  (syntax-rules () 
    ((_ (args ...) k ...) 
     (lambda (args ...) (+ k ... args ...)))))

And, if you want to do variable arity, you can do it by matching an identifier representing the whole list of arguments


(define-syntax adder 
  (syntax-rules () 
    ((_ argl k ...) 
     (lambda argl (apply + k ... argl)))))

But notice the difference between the bottom and the top. If some users of the adder macro want to be able to specify that they are generating a procedure of arity 2, with appropriate error detection, they can use the first but not the second. And of course, the first does not do variable arity procedures.

To deal with this problem in the context of my generator macro, I introduced a helper macro that builds up an application expression based on the form of the formals list.


(define-syntax build-apply 
  (syntax-rules ()
    ((build-apply (rator rands ...) (args ...)) ;; usual case
     (rator rands ... args ...))
    ((build-apply (rator rands ...) (x . y))    ;; improper list, inductive case
     (build-apply (rator rands ...   x)  y))
    ((build-apply (rator rands ...) rest-args)  ;; improper list, base case
     (apply rator rands ... rest-args))))

The order of all three clauses above is very significant.

With that in place, here is the new form of my generator macro(s):


;; (generator ARGS BODY ...) => procedure 
;;   in whose body yield is bound to coroutine with caller
(define-syntax generator
  (transformer 
   (lambda (exp ren cmp)
     `(,(ren 'generator-via) yield ,(cadr exp) ,@(cddr exp)))))

(define-syntax generator-via
  (syntax-rules ()
    ((generator-via yield-id args body ...)
       (let ((get-next #f))
         (lambda args
           (call-with-current-continuation
            (lambda (exit)
              (cond 
               (get-next (build-apply (get-next exit) args))
               (else (let ((yield-id 
                            (lambda results 
                              (call-with-current-continuation
                               (lambda (next)
                                 (set! get-next (lambda (new-exit . args)
                                                  (set! exit new-exit)
                                                  (build-apply (next) args)))
                                 (apply exit results))))))
                       (call-with-values (lambda () body ...)
                         ;; below could also use build-apply with
                         ;; args, but then the "done-value"(s)
                         ;; returned from a generator would be forced
                         ;; to have value-count = length args
                         (lambda argl 
                           (set! get-next (lambda (new-exit . args)
                                            (apply new-exit argl)))
                           (apply exit argl)))))))))))))

Tuesday, August 25, 2009

yet another yield Scheme macro

I have seen yield done plenty of times before, but I managed to reinvent it again tonight.

A lot of the time when I've played with call/cc, I have approached it trying to recreate a form in a statement-oriented language, and so I would end up not caring about the argument I would pass to the invocation of the captured continuation (because all I cared about was reestablishing the control flow of the prior context; the prior context itself would just discard any value I passed to it). I think the usual interpretation of yield is an instance of this in statement oriented languages.

However, when I was playing with some code tonight, I saw that dummy value and said, "ugly! why is that there?" And I decided to see what would happen if I got rid of it.

This macro is the result. (Obviously you want a bit of sugar around it to non-hygenically bind yield, although there are cases where it is nice to bind a different name like visit instead of yield, depending on the domain.)


(define-syntax generator-via
  (syntax-rules ()
    ;; puzzle 4 U: what role(s) does arg-list serve?  (all occurrences matter)
    ((generator-via yield-id arg-list body ...)
     (let ((yield-id #f) (get-next #f))
       (lambda arg-list
         (call-with-current-continuation
          (lambda (exit)
            (cond (get-next (get-next exit . arg-list))
                  (else (set! yield-id (lambda results 
                                         (call-with-current-continuation
                                          (lambda (next)
                                            (set! get-next 
                                                  (lambda (new-exit . arg-list)
                                                    (set! exit new-exit)
                                                    (next . arg-list)))
                                            (apply exit results)))))
                        (call-with-values (lambda () body ...)
                          ;; puzzle 4 U: why below eta-expansion required?
                          (lambda args (apply exit args))))))))))))

Its got some fun behaviors. Consider:


> (define grows 
    (generator-via yield (x) 
      (let loop ((i 0)) 
        (loop (+ i x (yield i))))))

> (grows 1)
0

> (grows 0)
1

> (grows 0)
2

> (grows 0)
3

> (grows 1)
5

> (grows 0)
6

(yes I just realized I could/should have let-bound the yield-id itself. The other set! invocations are believed to be necessary, barring tricks mixing letrec and call/cc.)

Wednesday, July 29, 2009

(no) interpreters on iphone

I'm collecting blog posts and web pages discussing clause 3.3.2 of the iPhone SDK EULA and semi-related restrictions on iPhone apps.

mobilephonedevelopment article

rhomobile's interpretation of no interpreters rule

story of how a Basic interpreter got downgraded to a calculator (and how to bring the interpreter back if you're willing to jailbreak).

PhoneGap Discussion of arbitrariness of rejection

MacNN coverage of the PhoneGap issue.

Commodore 64 emulator rejection article

FrotZ got through the filter; I asked the developer about that in this discussion

Tuesday, July 21, 2009

git-svn argh!

CCIS decommissioned a number of its Solaris machines recently.

I was using one of them as the point of access for the Larceny SVN repository, via the svn+ssh protocol.

More recently I have been playing with git-svn, using that as my way to interact with the repository. I made a local git-svn clone of the Larceny SVN repository, and then I would clone the clone and make branches in the cloned clones as part of my experimentation with new ideas. Fun fun fun.

Except that CCIS decommissioned a number of its Solaris machines recently, and I was using one of them as the point of access, which mean that the repository url looked something like svn+ssh://zulu.ccs.neu.edu/path-to-repos/, and git-svn (somewhat reasonably) assumes that these URLS would remain valid forever, and puts a copy of that URL into every log message of the clone.

Now, its easy to update those log messages; that's what git-filter-branch is for. (See also this related blog entry.)

But what's not so easy is dealing with the sudden split in the tree structure of the development on my cloned clones; all of my local deltas seem to point to the old changesets with the old bad URL. And "git rebase" does not seem to be magically able to handle this case.

Thus: git-svn argh!

If the SVN folks though that a "svn switch --relocate" command was important, maybe the git-svn developer(s) should also consider their design decisions...

pnkfif

Sunday, November 06, 2011

Middle Squares and Big Nums

`BigNum.as` code

Middle Square Generation

Ugly Hacked Up UI Code

Nicer looking (and shorter) UI code presuming Flash Pro

Saturday, November 05, 2011

A Bit of Trignometry

Saturday, October 29, 2011

Bonjour de Paris

Thursday, August 27, 2009

from 90sec to 9sec.

mismatch twixt formals and invocation forms

Tuesday, August 25, 2009

yet another yield Scheme macro

Wednesday, July 29, 2009

(no) interpreters on iphone

Tuesday, July 21, 2009

git-svn argh!

Followers

Blog Archive

About Me

pnkfif

Sunday, November 06, 2011

BigNum.as code

Middle Square Generation

Ugly Hacked Up UI Code

Nicer looking (and shorter) UI code presuming Flash Pro

Saturday, November 05, 2011

Saturday, October 29, 2011

Thursday, August 27, 2009

Tuesday, August 25, 2009

Wednesday, July 29, 2009

Tuesday, July 21, 2009

Followers

Blog Archive

About Me

`BigNum.as` code