Identity of numbers in Common Lisp and Maclisp

Nick Levine asks:

If anyone knows of a Common Lisp implementation for which (eq 3 3) need not be true, could they get back to me? I know what the Spec says about this but have difficulty believing it.

Commenters duly reply that ABCL and some versions of GCL do sometimes have multiple fixnums with the same value. Basically, this can happen in any implementation that uses boxed fixnums and doesn't intern them.

But this is not the only thing that can happen to fixnum identity. Common Lisp allows much worse:

An implementation is permitted to make "copies" of characters and numbers at any time.

This is an ugly thing to have in a language with otherwise straightforward object semantics. CLtL explains:

In general, numbers in Common Lisp are not true objects; eq cannot be counted upon to operate on them reliably. In particular, it is possible that the expression

(let ((x z) (y z)) (eq x y))

may be false rather than true if the value of z is a number.

Rationale: This odd breakdown of eq in the case of numbers allows the implementor enough design freedom to produce exceptionally efficient numerical code on conventional architectures. MacLisp requires this freedom, for example, in order to produce compiled numerical code equal in speed to Fortran. Common Lisp makes this same restriction, if not for this freedom, then at least for the sake of compatibility.

If I understand correctly, the rationale refers to how PDP-10 Maclisp handled arithmetic. It used boxed numbers, and to reduce allocation it often kept them on temporary stacks. This created a problem when a stack-allocated number had to escape and become a permanent value:

Some operations, such as storing a number in a special variable, or RPLAC'ing it into a cons, clearly require permanent allocation of the number. However, in order to allow allocation-free numeric computation, PDP-10 Lisp allows the passing of "intermediate" numeric results between functions without allocating them. Because of this, a function receiving any argument which cannot be positively verified at compile time as not being a pointer to a "temporary" number ("PDL number") must be extremely circumspect about what it does with that argument, in fact, calling a runtime routine named PDLNMK whenever a result of "unknown parentage" is stored into a permanent place. The routine checks to see if the result being stored is in fact a pointer to a PDL number, and allocates it permanently if so.

Steele's Fast Arithmetic in MacLISP provides more detail: NCOMPLR reallocated numbers whenever they escaped and had to become permanent, so it sometimes allocated a single value more than once, producing multiple non-eq copies.

The same problem arises with deboxing optimizations in other compilers. Most deboxing compilers, however, are careful to avoid multiple reboxing, for performance reasons: they don't want to embarrass themselves by accidentally allocating even more numbers than they would with naive boxed arithmetic. This has the convenient side effect of preserving identity of numbers.

So while Common Lisp allows implementations to copy numbers at will, I suspect this is purely a historical relic, “rooted in NCOMPLR” rather than in any practical need. Has there ever been an implementation that exploited it?

1 comment:

  1. ABCL does that because it's on the JVM and doesn't do type analysis, so boxed integers are natural. The following Scheme implementations are also fully boxed: SISC (JVM), KSi, TinyScheme, Scheme 9, Dream, BDC, UMB. I suspect that NexJ (JVM), XLisp, and Rep have the same story, but I can't easily prove it. See FixnumInfo for details.

    Kawa (JVM) and IronScheme (CLR), like Jython, box everything but have preallocated boxes for a small range of integers: -100 to 1024 for Kawa, -100 to 999 for IronScheme.

    ReplyDelete

It's OK to comment on old posts.