Number syntax extensions

I approve of James Hague's proposal to allow numbers with k as a suffix, so 64k is 65536. However, there is that binary vs. decimal issue.

The IEC/ISO binary prefixes (Ki, Mi, Gi, etc.) have not been widely adopted, and for good reason: unlike the regular SI prefixes, they don't represent anything in spoken language. They come with proposed pronunciations, but these are too awkward (especially when followed by ‘byte’) to be accepted; ‘kibibyte’ sounds like a mockery of the SI prefixes, not an addition to them. As long as there are no binary prefixes in speech, there will be little demand for abbreviations that aren't short for anything.

There is (was?) another proposed set of binary prefixes which suffer this problem slightly less: bk, bM, bG, etc. They resemble the spoken ‘binary kilo-’, ‘binary mega-’, etc., so they might have some chance of adoption. Maybe programming languages should use them, so 1bk = 1024 and 1k = 1000. Or maybe they should unapologetically use 1k = 1024 — who needs powers of 10 anyway?
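To make the first option concrete, here is a minimal sketch in Python of a reader that accepts both families of suffix, so that 1bk = 1024 and 1k = 1000. The function name read_suffixed and the two tables are made up for illustration; nothing here is an existing library or anyone's actual proposal.

```python
import re

# Hypothetical suffix tables: decimal SI suffixes and the "b"-prefixed
# binary ones. Both tables are assumptions made for this sketch.
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9}
BINARY = {"bk": 2**10, "bM": 2**20, "bG": 2**30}
SCALES = {**DECIMAL, **BINARY}

# One or more digits, then at most one suffix.
_LITERAL = re.compile(r"([0-9]+)(bk|bM|bG|k|M|G)?")


def read_suffixed(text):
    """Read a non-negative integer literal with an optional size suffix."""
    m = _LITERAL.fullmatch(text)
    if m is None:
        raise ValueError(f"not a suffixed integer: {text!r}")
    digits, suffix = m.groups()
    # A missing suffix leaves the number unscaled.
    return int(digits) * SCALES.get(suffix, 1)
```

With these tables, read_suffixed("1bk") gives 1024, read_suffixed("1k") gives 1000, and read_suffixed("64bk") gives 65536. Switching to the unapologetic 1k = 1024 policy would only mean changing the DECIMAL table.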

Some other, less controversial extensions to number syntax:

  • Allow flonums to omit zeroes before or after the decimal point: 1. and .1 should be acceptable as floats. (In Common Lisp, 1. is an integer, for historical reasons.)
  • Separators, as in Ada and Ruby and rather a lot of languages nowadays: 10^9 can be written as 1_000_000_000. (Or maybe as 1,000,000,000, but I'd rather avoid comma, to avoid confusion over whether a string like 1,234.5 is 1234.5 or 1.2345.) This is especially useful in REPL output, so you don't have to read unpunctuated ten-digit numbers. (Usability hint: don't print separators for four- or five-digit numbers; they're not needed there.)
  • Exponents in integers: 1e9 should read as the integer 1000000000, not as a float. (This is a good substitute for decimal SI suffixes.) Negative exponents could read as ratios if available, or else as flonums.
  • Fractional SI suffixes: if 1k is 1024 or 1000, might 1m be 1/1000 (or 1/1024)?
  • Radix with ‘r’: 2r1101011 instead of #2r1101011, saving a dispatching character. (There is virtually no demand for radices over 16, so don't bother supporting them. If you want base 36, you'd probably like base 64 even better.)
  • Scheme-style complex numbers (-2+3i, 3.033-198.26i) may be easier to read than CL-style ones, although this seldom matters.
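
Several of these are easy to prototype. The following is a rough sketch in Python of a reader for the exact cases (separators, integer exponents, and radix with ‘r’), plus a printer that follows the usability hint about short numbers. The names read_number and show_integer are hypothetical, and treating "six or more digits" as the threshold for printing separators is my reading of the hint, not a settled rule. Floats, SI suffixes, and complex numbers are left out.

```python
import re
from fractions import Fraction

# <radix>r<digits>, e.g. 2r1101011; <mantissa>e<exponent>, e.g. 1e9.
_RADIX = re.compile(r"([0-9]+)r([0-9a-zA-Z_]+)")
_EXP = re.compile(r"([+-]?[0-9_]+)e([+-]?[0-9]+)")


def read_number(text):
    """Read an exact number written with the proposed extensions."""
    m = _RADIX.fullmatch(text)
    if m:
        base = int(m.group(1))
        if not 2 <= base <= 16:  # radices over 16 are deliberately unsupported
            raise ValueError(f"unsupported radix: {base}")
        return int(m.group(2), base)
    m = _EXP.fullmatch(text)
    if m:
        # int() accepts single underscores between digits (Python 3.6+).
        mantissa, exponent = int(m.group(1)), int(m.group(2))
        if exponent >= 0:
            return mantissa * 10**exponent  # stays an integer
        return Fraction(mantissa, 10**-exponent)  # negative exponent: a ratio
    return int(text)


def show_integer(n):
    """Format an integer, adding separators only at six or more digits."""
    return f"{n:_}" if len(str(abs(n))) >= 6 else str(n)
```

Here read_number("1e9") returns the integer 1000000000, read_number("1e-3") returns the ratio 1/1000, and read_number("2r1101011") returns 107. On the output side, show_integer(10**9) gives "1_000_000_000" while show_integer(12345) gives plain "12345".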

I suppose none of this is very important.

2 comments:

  1. I've always hated "1." as a float notation; the difference from "1" is too subtle. I'm unconvinced about "1e9"; if you want that to be exact, go for "#e1e9".

  2. Nice idea. What about using 10k to mean 10 kilobytes and 10K to mean 10,000? That way there's no confusion, and it can be easily implemented (at least in some languages ... :)

