Lisp without lists

Any significant program uses collections heavily, so most languages have several collection types - usually at least one sequence type and one dictionary type. Many languages have more, but library effort is limited, so every language omits some useful collections. As long as there are good alternatives, it's not a big problem.

Ruby doesn't have lists. RLisp borrows Ruby's data structures, so it doesn't have lists either. But it's a Lisp dialect! How can you have a Lisp without lists?

Pretty easily, it turns out. Despite their historical importance, lists don't have much to do with Lisp being Lisp. Their most important use is to represent S-expressions - but they aren't the only data structure that can do that. The important feature of S-expressions isn't that they read as lists - it's that they read as something simple and flexible and easy to work with. Vectors have all of these advantages, so RLisp programs are made of vectors instead of lists.
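To make this concrete, here's a minimal sketch (not RLisp's actual reader or evaluator, just an illustration) of code-as-vectors: the S-expression (* (+ 1 2) 3) becomes nested Ruby arrays, and walking it needs only indexing and mapping, no car/cdr.

```ruby
# Hypothetical sketch: the sexpr (* (+ 1 2) 3) as nested Ruby arrays.
expr = [:*, [:+, 1, 2], 3]

# A tiny evaluator over vector-based code. Atoms evaluate to themselves;
# a form is evaluated by mapping over its elements and dispatching on
# the operator - no cons cells anywhere.
def evaluate(form)
  return form unless form.is_a?(Array)
  op, *args = form.map { |f| evaluate(f) }
  case op
  when :+ then args.sum
  when :* then args.reduce(1, :*)
  else raise "unknown operator #{op}"
  end
end

evaluate(expr)  # => 9
```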

This is not traditional, but it's not a bad idea. Sure, cons and cdr are more expensive, and those are common operations on code, but performance of code manipulation is rarely a problem. And vectors are smaller than lists, which recovers some speed, and makes it cheaper to keep code around for debuggability. So I think other new Lisps should consider using vectors instead of lists.

I don't mean to suggest that lists aren't important. They are the functional sequence data structure, and every language that intends to support functional programming should have them. But RLisp is in its infancy, and it's forgivable if it doesn't have lists yet.

On the other hand, there's no excuse for its convention of writing close parentheses on their own lines. They look so lonely! More to the point, they waste space, separating the lines that have actual code. This is just as annoying in Lisp as it is in C, and there's no reason to do it in Lisp. Let indentation show program structure, and leave the parentheses at the ends of lines, regardless of whether they delimit vectors or lists.

Alternating lists

I complained about the awkwardness of processing alternating lists, such as in (let (var1 form1 var2 form2) ...), because ordinary collection operations don't understand them. But this is easy to fix. Arc's assignment operator uses an alternating list: (= a 1 b 2) is (do (= a 1) (= b 2)), just like setf in Common Lisp. So Arc has a function pair to convert (a 1 b 2) to ((a 1) (b 2)). A look through the Arc source shows it's always used with some sort of map, so there's still some repetition to remove. But it reduces the added difficulty of processing alternating lists to one function call. The inverse operation is easily accomplished with some sort of mappend. So the difficulty of processing alternating lists is not really a significant problem.
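In Ruby (or an array-based Lisp like RLisp) the same two operations fall out of the standard library - pair is each_slice and the mappend inverse is flat_map. A sketch, using arrays as sexprs:

```ruby
# Arc's pair, sketched in Ruby: (a 1 b 2) => ((a 1) (b 2))
def pair(xs)
  xs.each_slice(2).to_a
end

# The inverse is a mappend - flat_map in Ruby: ((a 1) (b 2)) => (a 1 b 2)
def unpair(pairs)
  pairs.flat_map { |p| p }
end

pair([:a, 1, :b, 2])       # => [[:a, 1], [:b, 2]]
unpair([[:a, 1], [:b, 2]]) # => [:a, 1, :b, 2]

# And the usual idiom of pair followed by map, e.g. over let bindings:
pair([:a, 1, :b, 2]).map { |var, form| [var, form] }
```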

My other objection was that they're hard to read. But alternating lists turn up in other places: keyword arguments, plists, #s(...) syntax - anywhere you need to write a dictionary in sexprs. I don't find any of these especially hard to read, so I suspect my discomfort with alternating let is just unfamiliarity. I suppose I should get used to it and stop complaining.

The sound of logic

Turning arbitrary data into sound or pictures can be an easy way of getting insight into it, because human brains are so good at picking out patterns in those forms. It can also be fun, as seen in Doug McIlroy's talk about the history of computing at Bell Labs:

It was customary for computer operators, for the benefit of computer operators, to put a loudspeaker on the low bit of some register on the machine, and normally the operator would just hear kind of white noise. But if you got into a loop, suddenly the machine would scream, and this signal could be used to tell the operator "oh, the machine's in a loop. Go stop it and go on to the next job." I remember feeding them an Ackermann's function routine once.

These days registers change a bit too fast for that. But I still like to listen in on a machine. I listen to the disk rattle, and miss it when I'm using a remote machine. I often wish I had something more informative to listen to or look at. Not predigested data like the CPU graph. Just raw data, so I can accidentally notice what's going on, without having to think about it. Page faults, system calls, memory allocation, stack height - things that vary with what the machine is doing, but not in any simple way. Things best interpreted not by conscious thought, but by unconscious pattern recognition, because you have hardware for that.

RLisp and language integration

Now that the importance of libraries is widely appreciated, most new languages make a point of having good ones. Usually they get them the easy way: they borrow someone else's, by being closely integrated with their host languages. At a minimum this usually means sharing the same data and being able to call code of the host language. That's all you need to borrow libraries, and it works fine. But integration can go further.

RLisp is a Lisp integrated into Ruby. The two languages already have virtually the same data model, so that part is easy. Calling Ruby code has a small complication: Ruby is based on message send, not on function call. RLisp deals with this by supporting both. In addition to the ordinary funcall form, it has a send function, for sending Ruby messages. And there's syntax to make it convenient: [a + b] reads as (send a '+ b).
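On the Ruby side, (send a '+ b) bottoms out in Object#send, which dispatches any message by name. A small illustration:

```ruby
# What RLisp's [a + b] corresponds to in Ruby: an explicit message send.
a, b = 3, 4
sum = a.send(:+, b)            # same as a + b  => 7

# Any method is reachable the same way, which is what makes send a
# uniform bridge to the whole Ruby library:
len = "hello".send(:length)    # => 5
```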

Since most RLisp calls are to Ruby methods, send is actually much more common than regular funcall. So much so that my first impression, on reading some RLisp code, was that they were backward: the Form Without A Name should be send, and funcall should be explicit. This is not the traditional Lisp way, but is it a bad way?

defun would have to define methods instead of functions, which would also make it easier for Ruby to call RLisp. It would be tempting to impose Ruby's restrictions on Lisp names, to minimize the hassles when calling back and forth. The main annoyance would be the loss of higher-order-ness: ordinary methods would no longer be suitable as arguments to higher-order functions, which would require an explicit lambda (or an η-expanding function macro - or could this be implicit?) and funcall. But that's no worse than in Common Lisp - or more precisely, in Smalltalk or Ruby.
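Ruby itself shows both workarounds: a method isn't a value, so passing it higher-order takes either an explicit block (the lambda route) or reification via Object#method (the η-expansion route). A sketch:

```ruby
# A method, not a first-class function:
def double(x)
  x * 2
end

# Passing it higher-order requires wrapping, one way or another:
doubled = [1, 2, 3].map { |x| double(x) }   # explicit lambda-style block
also    = [1, 2, 3].map(&method(:double))   # eta-expansion via a Method object
# both => [2, 4, 6]
```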

The resulting language would be unusually closely integrated with Ruby, by borrowing its namespaces and even its Form Without A Name from the host language. In fact, it wouldn't differ much from Ruby except in syntax. But since the motivation for RLisp appears to be "Ruby with macros", this similarity is not a bad thing. If RLisp became essentially an S-expression syntax for Ruby, maybe with the rough corners cleaned up, that would probably increase its utility, because it would be easier for users to move between the two languages.

One could make this even easier by further integration: defining a syntax for RLisp which looked exactly like Ruby, but read as S-expressions. The result would be a dialect of Ruby which happened to support macros. Not what either Matz or taw have in mind, perhaps, but it might be the easiest way to add macros to Ruby.

What was the problem with internal define in Arc?

According to Paul Graham, an early version of Arc had internal define, but there was a problem:

In a language with implicit local variables and macros, you're always tripping over unexpected lexical contours. You don't want to create new lexical contours without announcing it. But a lot of macros that don't look like blocks in the call expand into blocks. So we provided a second block operator, called justdo, which was like do but didn't create a new lexical contour (i.e. it is Common Lisp progn), and this is what you were supposed to use in macroexpansions.

The trouble was, I kept forgetting and using do instead. And I was thereby writing utilities with the worst sort of bug: the kind that might not show up for years, and only then in someone else's code.

Huh? My own Lisp dialect also has internal define, and I haven't had a problem with unexpected contours. Neither have thousands of Schemers. (They might point out the disagreement over define in let-syntax, and the difficulty of writing macros that expand into multiple defines, but I don't think anyone counts these among the biggest problems with the language.) I haven't even had trouble remembering to use justdo in my Lisp, possibly because it has the more distinctive name splice. What are these problems I'm supposed to be having?

I'm also surprised by how complicated the implementation was said to be:

We wrote a hideously complicated interpreter that allowed local variables to be declared this way. The ugliness of this code worried me: ugly things are generally a bad idea.

I suspect they just implemented it the wrong way. You can implement internal define quite simply by defining begin (including implicit begin!) as a macro which transforms define to letrec*. Most Schemes do it this way, and my own Lisp does too, more consistently: basic special forms like lambda and begin are actually macros over more primitive ones. It feels a little odd at first, but once you stop expecting to write in primitives all the time, it's quite comfortable. You get internal define and other macro-based luxuries without compromising the simplicity of the language kernel.
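The rewrite itself is small. Here's a sketch of the transformation (on Ruby arrays standing in for sexprs, and simplified: it gathers all defines rather than requiring them at the head of the body, as a real Scheme would):

```ruby
# Source-to-source rewrite, as a begin macro would do it: a body whose
# forms include (define name expr) becomes a single letrec* form.
def expand_body(forms)
  defines, rest = forms.partition { |f| f.is_a?(Array) && f[0] == :define }
  return forms if defines.empty?
  bindings = defines.map { |_, name, expr| [name, expr] }
  [[:"letrec*", bindings, *rest]]
end

expand_body([[:define, :x, 1],
             [:define, :y, [:+, :x, 1]],
             [:+, :x, :y]])
# => [[:"letrec*", [[:x, 1], [:y, [:+, :x, 1]]], [:+, :x, :y]]]
```

No interpreter support is needed beyond letrec* itself; everything else is macroexpansion.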

My experience with internal define has been almost entirely positive. This is so different from Graham and Morris' that I wonder if they were really doing something else. Now, how did they describe it?

In Arc we were planning to let users declare local variables implicitly, just by assigning values to them.

Were they actually creating variables by assignment? This is well known not to work, and for reasons having nothing to do with macros — ask a Python user about global. The Arc designers couldn't be repeating this old mistake. Could they?
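Ruby itself exhibits the ambiguity: the same assignment syntax means "update" or "create" depending on what's already in scope, which is exactly why implicit declaration is treacherous under macros.

```ruby
x = 1
bump = lambda { x = x + 1 }  # x is already in scope, so this updates the outer x
bump.call
# x is now 2

make = lambda { y = 10; y }  # y is not in scope outside, so this creates
make.call                    # a fresh local on each call
# There is no way, from out here, to assign the lambda's y - the same
# tension Python resolves with `global` and `nonlocal`.
```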

You know you think about usability too much when...

Last night I dreamed I was hurriedly trying to type something into a cellphone. It had a clever keyboard layout, designed to minimize the number of presses for common letters at the expense of rare ones, so ETAONRISH were just one press each. Fortunately I never had to type Q.

So far, so good. There was just one little problem: it didn't match the keys. They were labeled with the standard layout, where the common letter S takes four presses. The phone worked around this problem by showing the captions on an onscreen keyboard, for the convenience of beginners like me.

Unfortunately the ones it showed belonged to a different clever layout - one that tried to reduce keystrokes by spreading them more uniformly across all twelve keys. (This doesn't make a lot of sense, but I was asleep, okay?) So I was reduced to guessing, while both keyboards helpfully misled me, and the efficient layout was for naught.

Remember those easy, carefree flying dreams? I don't have those anymore. Instead I dream about difficulties controlling my flight. This is what happens when you think about usability problems too much. You start to dream about bad user interfaces.

Clojure

While I'm talking about new Lisps, I should mention Clojure, a new Lisp with close JVM integration and fancy concurrency support. I've been meaning to write something in it and post about the experience, but it may be a while before I get around to that. So here's the quick summary:

Clojure has the usual features of modern Lisps: lisp-1-ness, case-sensitivity (because it's hard to do case-insensitivity right), distinguishing () from nil from |nil|, pure symbols, shorter names, modules. It's quite clean, although Lispers should watch out for the renamings (familiar names like cons and do aren't what you expect). There is destructuring in binding constructs, and lambda is really case-lambda. Its Lispy core looks quite good (although as I said, I haven't used it much yet), and it closely resembles the orthodox modern Lisp.

Except for one thing: most data is immutable. Clojure aims at the most popular open problem in languages today: safe concurrency. Like Erlang, it tries to provide state only in forms that are easier to use safely in concurrent programs. Clojure has three: mutable special variables (since their bindings are per-thread), software transactional memory, and reactive message-passing. (See those pages for explanations and examples. There's also a wiki with more examples.) What it doesn't have is the ordinary mutation we take for granted in other Lisps. There's no setf, no mutable arrays or hashtables, no push.

Fortunately there's a lot of careful support for immutable collections, including syntax for [vectors] and {maps}, and generic iteration (even on Java collections!). It includes some useful functions that are often forgotten, such as mapcat and range. Clojure may not have state, but it tries to do the alternatives well enough that you don't feel the lack.

The ultimate alternative to language limitations is an FFI. Clojure is implemented in Java and runs on the JVM, so its FFI takes the form of extensive Java integration: nil is Java null, some interfaces are supported (in both directions!), and it's very easy to call Java code. There are some downsides to staying so close to Java: there's no tail-call optimization, and since threads are Java threads, their number is limited (potentially annoying, in a concurrent language). The upside is that it inherits all sorts of functionality from Java - the VM, massive libraries, even mutable data if you need it. This goes a long way toward explaining the completeness and high quality of the implementation.

There's something I'm wondering about, though. How do you pronounce Clojure? Same as "closure"? Or with /ʤ/ as in "Java"?