My latest hair-pulling problem. (or: learn more about Lisp than you care to)

I’m just logging this up here because someday, someone will ask for a war story about programming in the trenches, and I won’t remember enough details.

So: I’ve been working on a fork of the JSCL compiler, which is a (hopefully, someday) ANSI Common Lisp compiler, which writes out Javascript code. That means you could, someday, have all the power of a full Common Lisp implementation behind you and develop software to run in a web browser, which would be awesome.

I’m also trying to “chunk” my changes into big “code drops” back on the upstream project, but those can take a while (and some discussion) before they get adopted, so I’m moving along in the interim. Admittedly, part of the problem is sometimes that one change requires other changes to go with it, in order to work, and accepting rather large code drops like that can be troublesome to review.

Where this leaves me is standing atop a rather different copy of JSCL than upstream — to the tune of 9,015 lines from “git diff,” meaning something on the order of 4,500 changed lines or so, in a codebase of only 10,818 lines total; at least a 40% change. Rather significant, in fact.

Many of these changes I’d categorize as improvements, rather than bug fixes: expanding ANSI compliance (the 1990 ANSI Common Lisp standard, as expressed by the Common Lisp HyperSpec at www.lispworks.com/documentation/HyperSpec/Front/ is the canonical “baseline” by which any Common Lisp implementation is to be judged), as well as providing for a cross-compilation from a running Common Lisp environment into the JSCL-mediated Javascript one.

It’s that last case that’s giving me strange problems.

Common Lisp is built upon only 25 Special Forms. (A “Form,” in Lisp parlance, is the basic building-block of a program; in the days of assembly-language, these would be “lines,” but Lisp’s free-format and redefinable syntax lets the programmer break lines at the margin and combine instructions much more freely than that.) While JSCL does not yet flawlessly handle all these forms, once it does, it will be possible to implement nearly all of the rest of the standard library atop them. (A few things, like file I/O, will require special integrations via Foreign Function Interface, in this case, to Javascript.)

Writing the code for JSCL’s compiler is, then, currently an exercise in writing using cognates, avoiding bits of the language that don’t operate correctly in JSCL yet, and ensuring that everything runs just as well in the “host” compiler — in this case, SBCL. There are two things at play: native compilation, where the code is being compiled by SBCL into x86-64 binary machine language that runs on my computer (or my web application server), and cross-compilation, where the code being compiled is eventually transformed into Javascript to run in a web browser (or Node.js).

The problem is that, during native compilation, there is not really a boundary between the compiler’s running environment and the code being compiled. When I define a new macro function in a CL program, I can then “immediately” use that macro in the next form being compiled. A lot of things are implemented in terms of macros, including many parts of the standard library itself. A significant fact is that a macro function receives as its input an asbstract syntax tree of the uncompiled program body that it encloses. Let me explain that a little bit.

When you apply a function to some arguments in a simple case, the CL standard (as with most languages) says that each argument is evaluated and its return value is what is (eventually) passed to the calling function. A literal value, like a number, evaluates to itself, so a simple imaginary call like add[4,5] would take the literal values “4” and “5” and pass them as inputs to a function add. However, assuming that add does the obvious thing, then it will return the value 9. Every form in CL can return 0 or more values of any type: numbers, strings, complex objects.

So, we can nest the call: subtract[add[4,5],7] would give the function subtract two inputs, one of which is the result of evaluating add[4,5], the other being the literal 7. The actual machine-language instructions for this might be something like: load r0, 4; load r1, 5; add r0, r1 into r2; load r3, 7; subtract r2, r3 into r4;

Now, in Lisp we write functions a little differently than the above (M-expression notation), so the same expression would be written in S-expression notation (for syntax expression) as (subtract (add 4 5) 7) — or, in fact, using the built-in functions — (- (+ 4 5) 7). (Yes, the actual names of the functions are “” and “+” in CL, but unlike some languages, they’re not considered “special operators” with their own syntax, so we just call them like any other functions.

So, the following are all the same expression:

Math notation: ((4 + 5) – 7)

M-expression: subtract[add[4,5],7]

Message-passing form: 4.add(5).subtract(7)

Math Polish notation: (- (+ 4 5) 7)

Reverse Polish notation: ((4 5 +) 7 -)

S-expression: (subtract (add 4 5) 7) = (- (+ 4 5) 7)

Note that Syntax Expressions are built around the Polish notation from maths, where stack-oriented languages like Forth are based on Reverse Polish notation.

So what makes S-expressions so much better? Why would decades of Lispers use a notation that’s so different from the M-notation and message-passing forms used in languages like C and Java?

It’s all about the macros.

A macro-function, as I mentioned earlier, does not evaluate its arguments. It turns out that within most compilers programs are represented in a form called an abstract syntax tree … which happens to be nearly identical to the S-expression form (and quite similar to the M-expression form, but not quite). The input to a macro function is the actual S-expression.

One of the basic building-blocks of Lisp is the if special form. It takes three parts: a test, and two possible outcomes. You’ve probably seen it in its M-expression form in a spreadsheet program:

=if[a5=b5;c5;0]

In Lisp, it’s written:

(if (this-is-true-p) (do-this) (otherwise-that))

The predicate test function is this-is-true-p, and it has no arguments passed to it. (If it did, it might be written: (this-is-true-p arg1 arg2) or so.) What makes if a special form is that it will only evaluate one of its two outcomes.

Let’s make it interesting.

(if (button-pressed-p)
(launch-nuclear-missiles)
(show-happy-face))

If if were a normal function, we would evaluate the three functions, and collect their return values:

(button-pressed-p) ⇒ nil (meaning “false,” “nothing”)
(launch-nuclear-missiles) ⇒ 0 (meaning “no survivors”)
(show-happy-face) ⇒ “☺”

… and then, we would call the function if with the results of those evaluations:

(if nil 0 “☺”)

In hypothetical machine-language form:

call button-pressed-p, result into r0;
call launch-nuclear-missiles, result into r1;
call show-happy-face, result into r2;
call if with r0, r1, r2, result into r3;
return r3;

Clearly, this would be a Very Bad Thing.

Because if is a special form, it only evaluates the outcome that is decided upon by the predicate.

call button-pressed-p, result into r0;
if r0 is eq to nil, jump to L2, else, jump to L1;
L1:
call launch-nuclear-missiles, result into r1;
jump to L3;
L2:
call show-happy-face, result into r1;
continue to L3;
L3:
return r1;

This is clearly what we needed, here.

Macros allow us to expand the syntax of Lisp by building upon these special forms. A macro takes in its unevaluated arguments, and produces as its output an S-expression (or abstract syntax tree), which in turn, is given to the compiler.

Let’s look at when or unless. Often, we don’t need the full “power” of if, but we might want to carry out multiple steps. The when form is a standard macro that lets us carry out one or more steps only if the predicate is true. (Any value other than nil is “true” for this purpose.) unless does the opposite, only running if the predicate is false (nil).

(when (button-pressed-p)
(start-sprinklers)
(fresh-line)
(princ “I have started the sprinklers.”))

when can be written in terms of the special form if. Here, progn is a special form that groups together “n” forms (any arbitrary number of forms) into a program and runs them all, returning the last form’s value.

(defmacro when (predicate &body body)
`(if ,predicate (progn ,@body) nil))

The ` and ,@ are special syntax that expands into the longer internal form:

(defmacro when (predicate &body body)
(list ‘if predicate (cons ‘progn body) nil))

Dodging a full explanation of the syntax, this means that the above example is transformed into this:

(if (button-pressed-p)
(progn (start-sprinklers)
(fresh-line)
(princ “I have started the sprinkers.”))
nil)

The transformation makes the program code possible to write in a more expressive way. This is a pretty trivial example: many macros are far more complicated.

Writing macro functions lets a Lisp programmer never have to repeat themself, and make the syntax of the language adapt to whatever we need it to do.

Note, however, that when (like if) cannot be expressed as a normal function, because its outcome would be evaluated first, and then passed in to it.

Now, after all of those lessons, why is this a problem for me and JSCL?

Well, is it turns out, my cross-compilation environment does not manipulate the native environment. When my native compiler evaluates a defmacro form, that function is immediately available to it to run. The contents of the macro code are converted into native machine language, and it can be called like any other function, given the AST as its input.

Here is the actual AMD64 assembly-language code for SBCL’s when:

 (disassemble 'when)
; disassembly for (:MACRO WHEN)
; Size: 136 bytes. Origin: #x1000AA6693
; 693:       4C894DF8         MOV [RBP-8], R9                 ; no-arg-parsing entry point
; 697:       488D7424F0       LEA RSI, [RSP-16]
; 69C:       4883EC18         SUB RSP, 24
; 6A0:       498BD1           MOV RDX, R9
; 6A3:       BF02000000       MOV EDI, 2
; 6A8:       488B0599FDFFFF   MOV RAX, [RIP-615]              ; #<FDEFINITION for SB-C::GET-INFO-VALUE>
; 6AF:       B904000000       MOV ECX, 4
; 6B4:       48892E           MOV [RSI], RBP
; 6B7:       488BEE           MOV RBP, RSI
; 6BA:       FF5009           CALL QWORD PTR [RAX+9]
; 6BD:       4C8B4DF8         MOV R9, [RBP-8]
; 6C1:       488BFA           MOV RDI, RDX
; 6C4:       488B35A5FDFFFF   MOV RSI, [RIP-603]              ; 'SB-INT:SPECIAL-FORM-FUNCTION
; 6CB:       488B0D86FDFFFF   MOV RCX, [RIP-634]              ; 'UNDEFINED-FUNCTION
; 6D2:       483B3D77FDFFFF   CMP RDI, [RIP-649]              ; :SPECIAL-FORM
; 6D9:       480F44CE         CMOVEQ RCX, RSI
; 6DD:       4C8D4424F0       LEA R8, [RSP-16]
; 6E2:       4883EC18         SUB RSP, 24
; 6E6:       488BD1           MOV RDX, RCX
; 6E9:       488B3D70FDFFFF   MOV RDI, [RIP-656]              ; :NAME
; 6F0:       498BF1           MOV RSI, R9
; 6F3:       488B056EFDFFFF   MOV RAX, [RIP-658]              ; #<FDEFINITION for ERROR>
; 6FA:       B906000000       MOV ECX, 6
; 6FF:       498928           MOV [R8], RBP
; 702:       498BE8           MOV RBP, R8
; 705:       FF5009           CALL QWORD PTR [RAX+9]
; 708:       90               NOP
; 709:       CC10             BREAK 16                        ; Invalid argument count trap
; 70B:       6A20             PUSH 32
; 70D:       B930264200       MOV ECX, 4335152                ; alloc_tramp
; 712:       FFD1             CALL RCX
; 714:       415B             POP R11
; 716:       E91AFEFFFF       JMP #x1000AA6535

No, I don’t expect anyone reading this to understand the above, I’m just making the point that it’s decidedly native. That’s the kind of code that only a CPU could love. It also runs blazingly fast, even with the error-checking code you can see hinted-at in the disassembly listing.

When cross-compiling, though, I’m not producing native code — I’m writing Javascript. That is the goal, though, isn’t it? If I sent the above code to a web browser, it would choke. Thankfully, web browsers don’t accept native code from strangers and run it on your CPU unfiltered.

In the case of macros, though, I now can’t invoke them on the cross-compiler. Or can I?

There are really two possible solutions that present themselves here. The more obvious one, perhaps, is to take the generated Javascript code and just run it on my server. One problem with that is that the generated Javascript is slow, by Lisp standards. I can macroexpand when many thousands of time over in the time it would take to perform the same work in Javascript, in the best case. Another problem would be to actually put together the “plumbing” between the Javascript compiler/interpreter and the hosting Lisp environment, which (no matter how it’s done) would not exactly be “pretty.”

The alternative (and the one I’m going for) is to cheat (of course).

To play an honest game, you have to be good at solving puzzles. But to cheat, you have to be great at solving code. Those are the guys I need on my team… the ones who can break into the code, find the back doors, figure out “Plover” and “Xyzzy” and “Fee Fie Foe Foo” to get it done. — Cameron Howe, Halt and Catch Fire, “Adventure”

So, in this bizarre world, the answer is … do everything twice.

Think about that for a second. In order to get a faster response, we want to do all the work … twice.

It’s actually kinda neat, when you think about it. When we’re compiling a file, we evaluate each top-level form. Usually, that’s going to be a form with some side effects on the environment itself: we might define a new function, or macro, or variable, or “whatever.”

In order to run the macro function, it may in turn depend upon other macros, special forms, or normal functions in the environment. When all of these happen to be a part of the Common Lisp standard package (named, cleverly enough, Common-Lisp), then they should exist with identical behaviour in both the native (SBCL) and cross-compiler-targeted (JSCL) environments — although, as noted, JSCL is not yet fully ANSI-compatible, we’re working on the deficiencies when they’re discovered.

But Lisp compilation happens into packages. I can have a function named start, and you can have your own function called start, and as long as we define them in their own packages, nary the twain shall meet. We can export the ones we intend to share, and others can refer to them using the package-name as a qualifier, though; so if my package is Global-Thermonuclear-War and yours is Cool-Sprinklers, someone can see clearly that Global-Thermonuclear-War:Start is a different function that Cool-Sprinklers:Start.

That means that all of these functions and macros being compiled by JSCL into Javascript code to run in the browser, can also be simultaneously compiled by SBCL into native machine code to run on the host! Even the bits of JSCL that might otherwise conflict with SBCL — The Common-Lisp package itself — only need to be redirected instead into a package with a different name (say, JSCL/Common-Lisp) and nobody has to be the wiser.

And there’s the answer. Not only do I get code for the web browser, but I get shared functions, many of which I could (if I so chose) evaluate on the host side. (Of course, only the ones that don’t interact with any browser-specific bits, like reading the keyboard or drawing in WebGL; unless I define those things on the host some day.)

Problem explained and (potentially) solved. Although this little indicator will let you know when my fix has really made it:

As I write this, that still reads build error, but hopefully soon it’ll flip over to show success.

What do you think? Add your comments →