Arcane Sentiment: July 2012

Chaining comparisons

Mathematical notation often uses more than one comparison operator in a row:

0 ≤ x < n

Anyone who tries that in C is in for a surprise (and possibly some frustrating debugging, since this sort of bug is easy to not see), but some languages do support it. I know seven ways to do it:

N-ary comparison: In most Lisps, comparison operators can take more than two arguments. This doesn't solve the whole problem, since it only covers chains that use two calls to one operator, or that can easily be rewritten to use one operator (e.g. 0 ≤ x < n as 0 ≤ x ≤ n - 1), but it helps, and it doesn't complicate the kernel. It doesn't work well with infix, though.
Combinator: Something like chain a f b g c = f a b && g b c enables writing (chain 0 ≤ x < n), if you don't mind the distraction of an explicit combinator call. In an infix language where it's awkward to pass comparison operators as arguments, you may be able to improve on this by making chain a macro.
Macros: Comparison operators can be macros which examine their arguments to see if they're calls to other comparison operators, and if so, expand to chain them. This requires a language with macros and infix (and in which booleans aren't possible arguments to comparisons), which explains why I don't know of any language that does it. In such a language, though, it's an easy way to provide fullly general chaining in user code, without kernel support, at the price of turning all the comparison operators into macros. If the language has a feature that rewrites function calls, like a more reliable version of Common Lisp's compiler-macros, then that price can be avoided.
Multiple infix: Some languages (e.g. Smalltalk, although it doesn't cover this case) support parsing 0 ≤ x < n as a call to a single ≤< operator with three arguments. In such languages, you can define that operator to be equivalent to (0 ≤ x && x < n). This doesn't scale well to more operators than the basic six, but if you already have multiple infix, it's easy to cover all the common cases of chaining.
Associativity option: If you have user-defined infix operators with a choice of left, right, or no associativity, you can add chaining as a fourth associativity option. This is what Perl 6 does.
Separate channels: In Icon, comparison operators give their Boolean result by success or failure, leaving the return value available for something else. So they return their right argument, and (0 ≤ x) < n works. This requires unusual language semantics, though.
Special case: Python simply has a special case for this, just as it has other special cases to make Boolean expressions look more familiar (e.g. is not).

The language-design pattern behind chaining is to redefine an implausible construct (in this case, comparing the result of a comparison) to express some more useful meaning instead. Usually the implausible construct is an error case — many language features assign meaning to otherwise meaningless operations, and this isn't hard to get right, because there's no question about whether the construct really was implausible.

It's much harder to apply in cases like chaining, where the implausible construct is not actually invalid: it's hard to tell how unlikely it is, and if it appears anyway (as e.g. a == (b < c) occasionally does), the semantic irregularity hurts. Irregularity is a problem even when redefining error cases, because it can produce incoherent semantics, like + overloaded for arithmetic and catenation. So this is not an easy pattern to follow, but it's one that has produced useful results.

Sinful total division

You know how annoyingly often you want to display a ratio — a percentage or framerate or some such — and you don't care what happens if you divide by zero, because the result isn't meaningful then anyway? But you don't want it to crash, so you end up writing things like this, over and over...

printf("did %d things (%d%% hard) at %d/s\n",
       n, n ? nhard * 100 / n : 0, dt ? n / dt : 0)

If you're writing it over and over, it should be a function, of course:

div0 a b = if (zero? b) 0 (a / b)

This is a sinful function: it doesn't help you do the Right Thing, only helps you be sloppy more conveniently. Programmers tend to feel guilty about writing such functions, and especially about including them in public interfaces, lest they encourage someone else to be sloppy. Often that guilt deters us from writing them at all, even when we plan to be sloppy anyway, so there would be no harm in it. Language designers avoid such features even more scrupulously, and lament the ones they're forced to include for backward compatibility. Even if div0 is frequently useful, you won't see it in many languages' standard libraries¹.

On the other hand, a large class of language features are valuable because they support seemingly sloppy practices. Garbage collection lets you leak memory with impunity, dynamic redefinition lets you modify a program while it's running, automated tests let you use code you can't prove correct — so rejecting features that promote sloppiness is not a very good heuristic. I suspect a better understanding is possible here, but I don't have it yet.

¹ Although there probably is some language which lacks a good way to report exceptions, and therefore has div0 as the standard division operator.

Lessons from Fortress

Guy Steele says the Fortress project is “wrapping up”, apparently because further implementation work is not expected to be very enlightening. I suppose this is a good time to get round to blogging about it...

Alan Eliasen asks for commentary and lessons learned. Here are some things I've learned from Fortress:

If your language includes a flashy experimental syntax, that will dominate perceptions of it, even by knowledgeable and interested people. Fortress includes a lot of potentially useful work about compiling a flexible dynamically-typed language efficiently, but who notices?
If you describe generic functions as dynamically resolved overloading, they no longer look weird to O-O programmers. Indeed, they hardly look like a feature at all; they look like the natural way overloading should work. You can even get away with adding them to a non-Lisp language!
There's surprisingly little connection between language features and target audience. If the Fortress documentation didn't say it was intended as a replacement for Fortran, I would never have guessed. The mathematical syntax and many numeric types hint at this, but not strongly enough to overcome the dynamic typing and generic functions.
Batch compilation is a reasonable thing. Dynamism is traditionally associated with other forms of power and flexibility, but it's OK to do without it (or offer a batch mode that disables it) for the sake of optimization.
It's nice to clearly document what features of your language are new or unusual or special, for the convenience of readers who are interested in it as research. Otherwise they may overlook your new ideas among the rest, especially if the language is a complex one like Fortress.
Features that express something nonlocally can also have local versions. IIRC, old versions of Fortress had a dispatch form that had the same semantics as method dispatch, but in a typecase-like form without the nonlocality of method definition. (It doesn't seem to exist in the current version, though. Maybe I'm hallucinating and it never existed.)
The stated purpose of a language affects how interesting I think it is. Fortress is mostly a compilation research project, but it claims to be a replacement for Fortran, which makes it sound much less interesting than it really is.

If most of these lessons are about documentation and perception rather than about the language itself, that's because I've never used Fortress, only read its documentation and some of the blog posts (which, unfortunately, have vanished, along with most of the website, but may be restored someday).

current-jiffy is not usable portably

In addition to current-second, the R7RS draft adds a timer function, current-jiffy. Obviously the Theoretical Right Thing for a timer is a high-resolution version of the same linear timescale used for dates, which current-second can (in an ideal implementation) provide. But in some applications (especially the rather important one of profiling), the timer can be called very frequently, so it must avoid allocation, which means either returning a fixnum or modifying some sort of mutable timer object. This is the point of current-jiffy.¹

The main use case is something like this:

;;; Evaluate the body. Return multiple values: the time taken in
;;; seconds, followed by whatever values the body returned.
;;; This returns the time in seconds, not jiffies, for human
;;; readability. It returns the time as a number, rather than
;;; printing it somewhere, so it can be used programmatically.
(define-syntax time
  (syntax-rules ()
    ((time forms ...)
     (let ((start (current-jiffy)))
       (let-values ((vals (begin forms ...)))
         (apply values (/ (mod (- (current-jiffy) start) jiffy-modulus)
                          (jiffies-per-second))
                       vals))))))

> (time (/ (factorial 100) (factorial 99)))
0.0124525432
100 ;hopefully

Presumably R7RS-large will include such a macro. (time? elapsed-time? profile? bench?) But in the current R7RS-small, users can't portably write it, or do anything else with jiffies reliably, because they can't deal with overflow.

High-precision fixnum timers overflow fast. A 32-bit microsecond clock, for instance, overflows every 71 minutes, and a 30-bit fixnum every 17 minutes. This is too frequent to ignore, so all subtraction and comparison of jiffies needs to be modulo the word size, so all programs using current-jiffy need to know jiffy-modulus. In theory they could determine it by experiment, but that takes a while and is imprecise, so in practice portable code can't use current-jiffy for anything but setting random seeds.

This could be fixed by adding jiffy-modulus (jiffy-limit? max-jiffies?) to the standard. But since current-second already covers timing fairly well, it might be better to leave current-jiffy to R7RS-large. Efficient timers are a useful feature, but not one that belongs in Small Scheme.

By the way, shouldn't jiffies-per-second be a constant, not a function? It's in the section called “Standard Procedures”, which does not describe a numeric constant, but it's better to change the title of the section (Standard Definitions? Standard Procedures and Constants?) than to change the language to fit the title.

¹IIUC it originally had another point: to provide a linear timescale even if current-second didn't. But now that current-second is supposed to provide TAI, current-jiffy is no longer needed for this, and even if current-second only provides disguised POSIX time, its nonlinearities are infrequent enough for most applications to ignore.

The Scheme that specifies TAI is not the eternal Scheme

The R7RS draft fixes a longstanding hole by adding the simplest possible time function: current-second, which returns the time since 1970 in seconds.

Unfortunately, it specifies that the timescale used is TAI. This is overspecification. The advantage of TAI over POSIX time is linearity (POSIX time jumps backward at leap seconds, so it's ambiguous, and time arithmetic gives incorrect results), but R7RS acknowledges that most implementations will compute it by simply subtracting a constant from their system clock. In addition to not being correct TAI and being gratuitously incompatible with POSIX, the resulting timescale remains ambiguous and nonlinear, so it has no advantage over POSIX time. Why specify a feature when you know it won't work?

I agree that the POSIX timescale is messed up, but this is not something languages can fix. As long as time is provided by the operating system, the choice of timescale is up to the operating system.

Also, time standards have historically been replaced fairly often, so Scheme risks specifying an obsolete standard which no one else uses. It's safer to simply leave the timescale and epoch unspecified, and let current-second be the system clock, whatever it is.

Oft-omitted features should be optional

Language implementors struggle with specifications. Those writing to a spec generally want to follow it, but unlike the authors of the spec, they face problems of implementability. When a standard specifies a feature that is too difficult to implement, or that interferes with performance or other desiderata, some implementors will simply omit it, whatever the language standard says.

Scheme has two features that are particularly annoying to implementors, because they affect how procedure calls are compiled: call-with-current-continuation and tail call optimization. Many implementations, particularly those that aim to interoperate with another language, don't implement them. In JScheme and Kawa, continuations only work outward: call/cc is call/ec. Kawa optimizes local tail recursion, but not tail calls in general. Their implementors would have preferred to follow R5RS, but they decided it was not practical to do so — they felt performance and compatibility with Java were more important than strict compliance with the standard.

To reduce the burden on implementors, the Scheme Reports have long declared some features optional. Most of the numeric tower is optional, and various standard procedures have been marked as optional in some versions. This has been a great help to implementors, who have been able to implement Scheme without worrying about little-used features like complex arithmetic, and has probably contributed to the large number of implementations.

Full continuations and nonlocal tail-call optimization are also little-used features — despite their general utility, few programs rely on either. Implementors know this, which is why they're willing to omit them, even if the standard does not. Users do not appear to consider this omission a serious flaw; JScheme and Kawa are generally considered serious, full-featured implementations, not incomplete toys. In existing practice, call-with-current-continuation and tail call optimization are optional.

Oleg has already proposed making call/cc officially optional, and I agree. Implementors and users treat it as optional, so the standard should too.

In Oleg's proposal, both call/cc and dynamic-wind are optional, and can be omitted entirely. This is unnecessarily disruptive, as it breaks the many programs that use call/cc only for escape continuations and dynamic-wind only for unwind-protect. I suggest a more conservative change: require call/cc to provide at least escape continuations, but make full reusable continuations optional. (There should be a cond-expand feature for this — full-continuations? full-call/cc? reusable-continuations? indefinite-extent-continuations? Just call-with-current-continuation?) This preserves compatibility for almost all programs, while removing one of the greatest burdens on implementors, so there will continue to be many good implementations of Scheme, even as it standardizes other implementation challenges like define-record-type and parameters and exceptions.

The same goes for other features that are difficult to implement. If neither implementors nor users think a feature is necessary, the language standard should acknowledge this, rather than condemn part of its community as noncompliant.

Kernel, library, and primitives

It's often useful to distinguish between parts of a language based on whether they can be expressed in user code:

The kernel is the part the compiler has to know about.
The library is the part that can be implemented in user code.
Primitives are the parts that are provided by the implementation, but could be expressed in user code.

This is an obvious and possibly familiar distinction, but it's underused. It should be part of the standard vocabulary for talking about languages.

Some examples in Scheme:

let is library, because it can be implemented as a macro. (It could be implemented as a macro in Common Lisp too, but for historical reasons it's in the kernel.)
procedure? is primitive, because while it could be implemented in user code, there's nothing it could be implemented in terms of. In a language with type-of, it's library.
The Form Without A Name (the function call operator) is kernel, because there's no way to define it, or anything like it, in user code. Even in Racket, where #%app can be redefined, there is no way for user code to define such a construct.
lambda, named-lambda and case-lambda can express each other as macros. At least one of them needs to be predefined, but the choice is arbitrary. Since the compiler needs to know about that one, it's kernel and the other two are library.
~~Full call-with-current-continuation is kernel, not just a primitive, because it can't be implemented without changing how function calls are compiled.~~ Not really; see comments.
cons is a primitive in R5RS, because there's no way to define new datatypes. (Defining pairs as tagged vectors doesn't count, even if you redefine all vector operations to reject them, because existing code will still see them as vectors.) In R6RS, it can be library, because it could be defined with define-record-type. (However, it's often still primitive, for efficiency or for historical reasons.)
syntax-rules is mostly library, but requires some primitive support. define-syntax is kernel, but in a language with a hook to define new code-transformation passes in user code (kind of like Perl's source filters), it could be library.
The distinction between kernel and library has nothing to do with what names are standard or in the default namespace. map is standard and predefined but is pure library; if ~~define-values~~ set! is provided in a separate module, it's still kernel, because it requires compiler support.
~~Garbage collection is kernel, because it requires compiler (and runtime) support.~~ Not really; see comments.

Some things to say using these terms:

Library has to be implemented in terms of something, so primitives are necessary. There are often additional primitives for efficiency (especially in slow implementations), but most primitives add functionality though would be otherwise unavailable.
When two constructs can express each other, either one could be primitive; the choice is often arbitrary.
A foreign function interface can be seen as an endless source of primitives. This is one of the reasons FFIs are so important for new languages: they compensate for a shortage of primitives.
Library, unlike the other two parts, is potentially portable between implementations. If a language provides a good reference implementation of its library (as most SRFIs do), this makes it much easier to implement.
Complexity in the kernel is much more expensive than in primitives or library, for both implementors (because it complicates the compiler) and users (because they can't understand the kernel according to the same rules as user code). Primitives, however, aren't necessarily more expensive than library.

Expressive features are particularly interesting when they allow the language to express things in library that would otherwise have to be in the kernel. Such features can often pay for their complexity by simplifying the kernel. For example:

In most languages, all abstract syntax (and concrete syntax too, for that matter) is kernel. In languages with macros, most abstract syntax can be library, leaving only a few special forms in the kernel. In a fexpr language, even the special forms are mere primitives, because the compiler doesn't have to know anything about them. (It might still know, for efficiency, but it doesn't have to.) Kernel has a very small kernel.
Exposing compiler data structures such as environments allows user code to do things otherwise possible only in the kernel. In Scheme, there is no programmatic way to create definitions, so define and define-values are kernel. In Common Lisp, or in a Scheme extended with a define! procedure that creates top-level definitions, top-level define and define-values can be implemented as macros.
Optimization is usually kernel, but Common Lisp and GHC can express some optimizations in user code. (This is a powerful and underappreciated capability.)
ML's typechecking is kernel, because there's no way for user code to define it. In a language which allows user-defined program analyses (are there any?), it could be library. (This feature would fit nicely with user-defined optimizations.)

Caveats

There are really two orthogonal distinctions here: kernel vs. library is about whether the compiler needs to know about the feature; primitive vs. derived is about how it was implemented. However, derived kernel is usually impossible (what, are you redefining eval?), and the distinction from primitive kernel is not very interesting, so I don't bother.

The distinction between kernel and library is often more important than that between library and primitives. In these cases it's convenient to lump primitives and library together.

Some features require runtime but not compiler support, like fixnums or weak pointers or unboxed numeric arrays. These form a class (“runtime features”?) between primitives and kernel. I didn't include this class in the list above because I think it's less important, and can usually be lumped in with kernel or primitives.

I'm inconsistent about whether “primitive” refers only to features that must be built in, or to all those that are built in to a particular implementation.