Arcane Sentiment: Unboxed arrays break identity

Common Lisp explicitly allows its implementations to copy numbers whenever they feel like it, so object identity is not reliable for numbers. Previously I said this was a relic of Maclisp, but I overlooked a simpler and stronger reason: unboxed arrays. Long ago on RRRS-authors, Pavel Curtis gave another example where numbers might be copied:

(let ((v (make-vector 1 3.0)))
  (eq? (vector-ref v 0) (vector-ref v 0)))

This returns true in any ordinary Scheme, because storing a number into a vector does not copy it. However, if v is an unboxed vector of floats, this will probably return false, because the number naturally gets boxed twice. It does in Racket:

> (require racket/flonum)
> (let ((v (make-flvector 1 3.0)))
    (eq? (flvector-ref v 0) (flvector-ref v 0)))
#f

And SBCL:

CL-USER> (make-array '() :element-type 'single-float :initial-element 3.0)
#0A3.0
CL-USER> (eq (aref *) (aref *))
NIL

(That's a zero-dimensional array, with one element.)

Clojure doesn't explicitly allow copying of numbers, but does it anyway, of course:

user> (let [x 1.0 v [x]] (identical? (v 0) (v 0)))
true
user> (let [x 1.0 a (double-array [x])] (identical? (get a 0) (get a 0)))
false
user> (let [x 1.0 a (object-array [x])] (identical? (get a 0) (get a 0)))
true
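
The same contrast can be reproduced in plain Java, which is probably close to what happens underneath Clojure here: double-array creates a primitive double[], while object-array creates an Object[]. The sketch below is only an illustration, not Clojure's implementation, and the class and method names are made up for it. It assumes a typical OpenJDK runtime, where boxing a double allocates a fresh java.lang.Double each time; the Java language does not promise either answer for the primitive array.

```java
final class ArrayReboxing {
    // Reads the same slot of a primitive double[] twice. Each read yields a
    // raw double that must be boxed to become an Object, so the two reads
    // normally produce two distinct Double objects (assumption: the runtime
    // does not cache boxed doubles, as in current OpenJDK).
    static boolean sameFromDoubleArray(double x) {
        double[] a = {x};
        Object first = a[0];
        Object second = a[0];
        return first == second;   // reference comparison, like identical?
    }

    // Reads the same slot of an Object[] twice. The slot holds a reference
    // to one boxed Double, so both reads return that same object.
    static boolean sameFromObjectArray(double x) {
        Object[] a = {x};         // boxed once, here
        Object first = a[0];
        Object second = a[0];
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(sameFromDoubleArray(1.0));
        System.out.println(sameFromObjectArray(1.0));
    }
}
```

On such a runtime the first call should report distinct objects and the second the same object, matching the double-array and object-array results above.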

Clojure doesn't even need an array for this, since it sometimes unboxes ordinary local variables without preventing multiple reboxing:

user> (let [x 1.0] (identical? x x))
false
user> (let [x (if true 1.0 1)] (identical? x x))
true
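
In Java the distinction is visible in the declared types, whereas in Clojure the compiler infers it; presumably the (if true 1.0 1) form keeps the compiler from treating x as a primitive double. A rough Java analogue, with hypothetical names and the same assumption that the runtime does not cache boxed doubles:

```java
final class LocalReboxing {
    // x is a primitive local, so every conversion to Object boxes it again.
    static boolean primitiveLocalIsIdentical() {
        double x = 1.0;
        Object first = x;    // boxes x
        Object second = x;   // boxes x a second time
        return first == second;
    }

    // x already refers to one boxed Double, so nothing is reboxed.
    static boolean boxedLocalIsIdentical() {
        Object x = Double.valueOf(1.0);
        Object first = x;
        Object second = x;
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(primitiveLocalIsIdentical());
        System.out.println(boxedLocalIsIdentical());
    }
}
```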

Scala hides the issue by making eq unavailable on potentially unboxed types like Double (and therefore on Any, which might be annoying):

scala> 1.0 eq 1.0
<console>:7: error: value eq is not a member of Double
       1.0 eq 1.0
       ^

Any language that boxes floats but wants efficient numerics practically has to support unboxed numeric vectors, and therefore allow implicit copying of numbers, since preventing it requires (undecidable) nonlocal analysis. So its spec must provide some permission to copy numbers, or more generally any boxed type that has an unboxed container; the problem is not specific to numbers. This permission need not be a blanket license to copy, though; it could be restricted to values read from specialized arrays. Or, to permit unboxing variables without forcing the compiler to be paranoid about multiple reboxing, it could cover a conservative approximation of "potentially unboxed numbers", e.g. those in local variables statically known to be numbers of a specific type, whose values come from unboxable operations (those that compute new numbers: sin, not car).
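
Whatever form the permission takes, code that compares numbers by value rather than by identity is unaffected by it. A minimal Java sketch, with made-up names: Double.equals compares bit patterns, so it behaves roughly like eqv? (it treats NaN as equal to itself and distinguishes 0.0 from -0.0), which is an approximation rather than an exact match for any Lisp's predicate.

```java
final class ValueComparison {
    // Boxes the same double twice and compares the boxes by value.
    // The result does not depend on whether the runtime copies numbers.
    static boolean equalByValue(double x) {
        Object first = x;
        Object second = x;
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println(equalByValue(3.0));
        System.out.println(equalByValue(Double.NaN));
    }
}
```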

Does this make NaNboxing sound more attractive?

6 comments:

John Cowan, 14 September 2013 at 01:38
What it means, as far as I am concerned, is that Scheme's object identity predicate is eqv? (CL's EQL) and not eq?. Indeed, ever since R4RS there has been language in the standard similar to "An object fetched from a location, by a variable reference or by a procedure such as car, vector-ref, or string-ref, is equivalent in the sense of eqv? to the object last stored in the location before the fetch" (R7RS section 3.4, Storage Model). So in principle even pairs can return numbers that are not eq? to the ones they were set up to contain.

Things are further complicated by the special cases of NaN and procedures.

Arcane Sentiment, 19 September 2013 at 21:08
So Scheme also has blanket permission to copy, but by omission rather than explicitly. This is not a great way to specify this sort of thing, both because it's easy to miss (as I did), and because it's hard to tell whether the permission is deliberate.

eq? matters because it's implementational identity: it shows what's actually happening, not just what's guaranteed. I'm trying to do a post on why this is important, but I keep confusing implementational/operational behavior with customary behavior and with explicit vs. implicit specification (as in the previous paragraph). These are three different issues that happen to coincide here.

Unknown, 12 September 2021 at 10:58
Tried the example with SBCL and at least with SBCL 2.1.8 it actually gives T.
