Arcane Sentiment: Standard trajectories

History repeats itself — in programming languages too.

Andreas Rossberg pointed out a hysterical raisin in Java:

The reason Java generics turned out sub-par is mainly because they were added too late! They had to work around all kinds of language design and infrastructure legacy at that point. Compare with C#, where generics were put into the language early on and integrate much nicer. That is, generics should be designed into a language from the very beginning, and not as an afterthought.

John Nagle replied:

The other classic language design mistake is not having a Boolean type. One is usually retrofitted later, and the semantics are usually slightly off for reasons of backwards compatibility. This happened to Python, C, LISP, and FORTRAN.

These are two of the standard trajectories languages take through design space. Some things happen over and over:

Languages make more type distinctions over time. Booleans are a common case of this: if you have integers or nil or a defined/undefined distinction, separate Booleans are obviously unnecessary, so they're often left out of early versions of languages. Eventually the designers tire of wondering which ints are really integers and which are Booleans, and add a separate type.
Restrictions chafe and get relaxed. In particular, languages with restrictive static type systems usually add extensions to make them less restrictive: witness Haskell's numerous extensions for high-order polymorphism, dependent types, deferred type errors, and so on. Java's generics are a case of this — it got along without them for a while because it was possible to escape from the static type system by downcasting.
General cases supplant special cases. In particular, many languages start with a small closed set of datatypes, and add more very slowly. At some point they add user-defined types, and initially treat them as another, very flexible type. Once they get used to user-defined types, they realize all the others are unnecessary special cases which could be expressed as ordinary (i.e. user-defined) types. Python went through this (painfully, because its built-in types had special semantics); Scheme will soon, since R7RS is standardising define-record-type. (I'm not sure if the Schemers realize this; several WG1ers objected to a proposed record? operator on grounds of strong abstraction, but none mentioned that it should return true for all types, including built-in ones.)
Interfaces are regularized. It's easier to add a feature than to understand its relation to the rest of the language, so many features have ad-hoc interfaces, which are later replaced with uniform ones. This seems to happen to collection libraries a lot, although this might be simply because they're large and popular.
Newly fashionable features are added in haste and repented later. How many languages have poorly-designed bolted-on “object systems”? Sometimes parts of the new features are later extended to the rest of the language; sometimes they're made obsolete by better-designed replacements; sometimes they're abandoned as not useful.
Support for libraries and large programs is often added later. Newborn languages have few libraries and no large programs, so there's little need for features like modules or library fetching or separate compilation. When they're later needed (or perceived to be needed), they're added, often as kludges like Common Lisp's packages and C's header files, or even as external tools like C's linker (and makefiles) and Clojure's leiningen. This may be changing: libraries are now considered important enough that new languages usually have modules, at least.
And, of course, languages grow. It's much easier to add a feature than remove one.

Many of these trajectories are easy to trace, because they leave trails of obsolete features kept around for compatibility.

What other pieces of history repeat themselves?

5 comments:

John Cowan1 December 2012 at 17:24
R6RS also has define-record-type, though with a different syntax. However, while pairs can be simulated with a record, vectors cannot be. Even in Smalltalk there is a fundamental distinction between ordinary classes, classes with a variable number of slots, and classes with a variable number of arbitrary bytes, and Scheme provides only the first type.

What really sets Scheme off from most other dynamically typed languages is its comparative monomorphism. With the exception of exact/inexact number polymorphism, the arguments to every standard Scheme procedure are either monomorphic or universally polymorphic. As long as there is no subtyping other than the numbers, using systematic names more than makes up for this, providing some of the run-time benefits of static typing. Since R7RS-large will reintroduce subtyping with single inheritance of slots (like Common Lisp structs), it'll be interesting to see if the pressure to standardize generic functions begins to go up either in WG2 or in a future standardization effort.

It's OK to comment on old posts.