Smalltalk's small syntax

Smalltalk gets a lot of attention for its object-orientedness, which is fine, but I've already learned those lessons, thanks. What I think deserves more attention is its syntax. Much of what is distinctive about Smalltalk doesn't come from its semantics but from its unusually simple, clear syntax. There are several things language designers could learn from it.

Simple infix

In the perennial argument between infix and prefix syntax, almost everyone assumes infix is much more complicated. Smalltalk shows it needn't be. You can read most Smalltalk code with only three kinds of infix expressions, one postfix, and a few other constructs. There are no special features to support infix, because everything is infix, even ordinary user-defined messages.

It's still more complicated than S-expressions, but not by much. If you like infix or dislike superfluous parentheses, it doesn't look like a bad deal.

Keyword arguments

First, an implementation note: the simplest way to handle keyword arguments is to consider them part of the function's name, and have foo(bar=1) be syntax for foo_bar(1). A different set of keyword args calls a different function: foo(baz=1) is foo_baz(1). This approach doesn't work for large numbers of independent optional arguments, but most of the time it's simple and efficient. Calling a function with keyword arguments becomes equivalent to calling one with positional arguments. There's no overhead for passing or processing a table of keyword args. The downside is that you can't do things like pass keyword arguments to an unknown function: (lambda (f) (f :reverse t :silent-error nil)). On the other hand, you can pass functions that take keyword arguments as if they took positional arguments - because they do.
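Here's a rough sketch of that idea in Python. The names (foo_bar, foo_baz, FUNCTIONS, send) are made up for illustration, and a real implementation would presumably resolve the name at compile time rather than with a dictionary lookup at runtime:

```python
# Hypothetical functions whose names already include their keyword names.
def foo_bar(bar):
    return ("foo_bar", bar)

def foo_baz(baz):
    return ("foo_baz", baz)

def foo_bar_baz(bar, baz):
    return ("foo_bar_baz", bar, baz)

# Stand-in for the compiler's symbol table (an assumption of this sketch).
FUNCTIONS = {f.__name__: f for f in (foo_bar, foo_baz, foo_bar_baz)}

def send(base, **kwargs):
    """Treat base(k1=v1, k2=v2) as syntax for base_k1_k2(v1, v2)."""
    # The keywords, in the order written, become part of the function name.
    name = "_".join([base, *kwargs])
    # From here on it is an ordinary positional call; no table of
    # keyword arguments reaches the callee.
    return FUNCTIONS[name](*kwargs.values())

one_keyword = send("foo", bar=1)          # calls foo_bar(1)
two_keywords = send("foo", bar=1, baz=2)  # calls foo_bar_baz(1, 2)
```

In this sketch a different keyword order is a different name, so send("foo", baz=2, bar=1) looks for foo_baz_bar and fails. And because foo_bar is just a one-argument function, it can be passed to something like map without any wrapping.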

Smalltalk takes this approach thoroughly. All arguments except the receiver are keyword arguments, so all message sends use keywords, and therefore all methods have names like value:with:. Rather than having method definitions define multiple names (one for each combination of optional keyword arguments), Smalltalk just makes all keyword arguments required. If you want optional ones, you have to define each combination separately. That's a bit of an annoyance (and probably discourages people from using optional arguments) but overall this system is easy to understand, easy to implement, and easy to use.
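To make "define each combination separately" concrete, here is a small Python sketch. The class Text and its methods are hypothetical; the method names stand in for selectors with all arguments required, and the "optional" variant is simply a second method that supplies the default:

```python
class Text:
    """Hypothetical receiver; each method name stands in for one selector."""

    def __init__(self, chars):
        self.chars = chars

    # Plays the role of a two-keyword selector: both arguments are required.
    def slice_from_to(self, start, stop):
        return self.chars[start:stop]

    # The "optional argument" version is a separate definition that fills in
    # the missing argument and delegates to the full one.
    def slice_from(self, start):
        return self.slice_from_to(start, len(self.chars))

text = Text("smalltalk")
head = text.slice_from_to(0, 5)
tail = text.slice_from(5)
```

Each extra optional argument means more definitions like slice_from, which is the annoyance, but every call site still names exactly the method it reaches.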

Is it verbose? Yes, a bit. But it's a form of verbosity that lends itself well to self-explanatory code. For people like me, who don't know the language very well, Smalltalk can be surprisingly easy to read, because nearly every argument has a helpful label beside it. It might be slightly longer than in a keywordless language, but it's longer in the right place.

Multiple infix

All these keyword arguments mean that many message sends are multiple infix: arguments alternate with pieces of the operator. This sounds hard to read, like C's ternary operator: a ? b : c. But it's not actually a problem. This is puzzling: if multiple infix isn't hard to parse, why is the ternary operator so unreadable? Is it because ? and : are so short and easily overlooked next to their arguments? Because the ternary operator is the only multiple infix operator in C, and rare, so you don't look for it? Because many programmers forget its precedence and swaddle it in parentheses? Is it only unreadable when it is used in inconveniently large expressions, as seems to happen a lot?

λ

Lambda isn't common when it has a six-letter name, but if it's shorter it is much easier to use casually. Smalltalk uses syntax to abbreviate its lambdas, and it's hard to beat them for terseness, or ubiquity. It also calls them blocks to be less intimidating - remember, this was originally a language for children - and it works: Smalltalkers don't seem to have a lot of trouble learning the mysteries of anonymous functions.

Surprisingly, Smalltalk doesn't abbreviate function call. But lambda is very useful even when calling is verbose, because it's much more common to pass a function as an argument than to accept one. It's especially useful when it's as short as Smalltalk's. The easy lambda alleviates much of the pressure of not having macros. It easily handles three of the most common uses of macros: binding constructs, control structures, and laziness. It's not transparent like macros, and it introduces a distracting asymmetry in expressions like foo and: [bar], but it's a lot better than nothing, and some people prefer the explicit lambdas, at least when they're only two characters long.
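The three uses can be sketched in Python, with lambda standing in for a block. The function names here (and_, if_true_if_false, let) are my own stand-ins, not Smalltalk's actual selectors or implementation:

```python
calls = []

def expensive():
    """Records that it ran, so laziness is observable."""
    calls.append("expensive")
    return True

def and_(receiver, block):
    """Laziness, roughly like `foo and: [bar]`: the block runs only if needed."""
    return block() if receiver else False

def if_true_if_false(condition, true_block, false_block):
    """A control structure as an ordinary function taking two blocks."""
    return true_block() if condition else false_block()

def let(value, block):
    """A binding construct: the block's parameter names the value."""
    return block(value)

skipped = and_(False, lambda: expensive())   # block is never called
evaluated = and_(True, lambda: expensive())  # block is called once
chosen = if_true_if_false(3 > 2, lambda: "yes", lambda: "no")
bound = let(21, lambda x: x * 2)
```

The asymmetry shows up here too: the first operand of and_ is a plain value while the second has to be wrapped, and the wrapping is much noisier with Python's six-letter lambda than with a pair of brackets.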

Unfortunately, the use of square brackets doesn't mix well with deeply nested parentheses, because of the difficulty of keeping the parentheses matched when editing. Smalltalk doesn't have a lot of parentheses, so this is only a minor problem, and it wouldn't be a problem at all in a structure editor, but it means this approach to lambda doesn't mix well with S-expressions. The brackets are also sometimes easy to overlook, due to their similarity to parentheses. Maybe a different syntax would be better, or even a short name, like, say, λ.

The bad parts

I don't like how Smalltalk handles variable declarations - surely a form of internal define (infix, of course) would be better. I don't like that it requires special syntax for constructing arrays and hashtables. No one likes its syntax for putting definitions in files, which is so clumsy it makes working without the browser practically impossible. There is no need for other languages to repeat these things. But the core of Smalltalk's syntax, its blocks and messages, is clear and convenient and above all simple. If you're designing a syntax, you could do much worse than imitate Smalltalk.

6 comments:

  1. Smalltalk doesn't require special syntax for Arrays and Dictionaries. ("Array with: 2 with: 3 with: 4" and "(Dictionary new) at: #foo put: 3; yourself" for example).

  2. GNU Smalltalk recently released a new syntax that makes it easier to put things in files. It's much cleaner than the ST80 syntax for this sort of thing, and it doesn't sacrifice any of the simplicity. Hopefully other smalltalks will follow suit.

  3. Smalltalk could get along without syntax for arrays and dictionaries, but it wouldn't be very convenient. Constructors for both types want to be terse and take any number of arguments, and there's no way to get either of those properties in Smalltalk without adding syntax. That's the sense in which they require special syntax.

    I kind of wish Smalltalk had general-purpose variable-arity messages, but it's not very pressing since you can fake it pretty well by passing an array.

  4. It's nice that the file syntax is finally being improved after decades. Maybe that will reduce the inconvenience of using it for real problems. It's very convenient to run Smalltalk code from within its own environment, but it's rather hard to call it from the outside world (at least in Squeak). So I've never used it for more than the programming equivalent of small talk.

    Of course, Ruby is another way to make Smalltalk practical. It lacks the parts I like, though - the simplicity and great IDE.

  5. I've been playing with Nu (http://programming.nu), which has a nice combination of the features you mention here. It's a Lisp, with almost everything that normally implies, including sexps, but it mostly uses Smalltalk's infix notation, so you say:

    (receiver method with:arg)

    It also allows λ for lambda, but that hasn't caught on :-(

  6. Self uses a close variant of Smalltalk syntax, where all keywords after the first must be capitalized: thus "x at: y Put: z" is the equivalent of Smalltalk "x at: y put: z", whereas "x at: y put: z" would be "x at: (y put: z)" in Self. I've never actually had to get used to this, so I don't know if it's good or bad.

