A brief history of “type”

The word “type” has a variety of meanings in programming languages, which are often a focus of confusion and contention. Here's a history of its use, focusing on particularly influential languages and papers.

1956: Fortran “modes”

The term “type” was apparently not yet established in 1956, because the Fortran manual speaks of integer and floating-point “modes” instead. It has something called “statement types”, but those are what are now called syntactic forms: assignment, conditional, do-loop, etc.

The 1963 Fortran II manual speaks of "two types of constants" (integer and floating-point), but this seems to be just the English word. When it talks about these types in more detail, it calls them “modes”, e.g. “arguments presented by the CALL statement must agree in number, order, mode, and array size with the corresponding arguments in the SUBROUTINE statement”. (Evidently the terms “formal” and “actual” parameters weren't established yet either.)

1958-63: Algol

Algol is one of the most influential languages in history. It introduced if ... then ... else, the int n declaration syntax, and semicolons. It also popularized the term “type”. The Algol 58 report defines type declarations on variables in terms of the “type” and “class” of values:

Type declarations serve to declare certain variables, or functions, to represent quantities of a given class, such as the class of integers or class of Boolean values. [...] Throughout the program, the variables, or functions named by the identifiers I, are constrained to refer only to quantities of the type indicated by the declarator.

The Algol 60 report is more consistent:

The various “types” (integer, real, Boolean) basically denote properties of values. The types associated with syntactic units refer to the values of these units.

Note that types are explicitly a property of values, not variables or expressions. But does “basically” mean someone thought otherwise, or just that this isn't a formal definition?

1967: Strachey's Fundamental Concepts

Chris Strachey's Fundamental Concepts in Programming Languages was an influential set of lecture notes that established a bunch of common terms. It defines types thus:

Most programming languages deal with more than one sort of object—for example with integers and floating point numbers and labels and procedures. We shall call each of these a different type and spend a little time examining the concept of type and trying to clarify it.

Strachey takes it for granted that types can be static or dynamic, and prefers static typing only for reasons of efficiency (which was, after all, of overwhelming importance in 1967):

It is natural to ask whether type is an attribute of an L-value or of an R-value—of a location or of its content. The answer to this question turns out to be a matter of language design, and the choice affects the amount of work, which can be done when a program is compiled as opposed to that which must be postponed until it is run.

Strachey does not mention type theory, because no one had yet realized that it could be applied to programs. That changed in the next year.

1968: type theory

James Morris was the first to apply type theory to programming languages, in his 1968 Lambda-calculus models of programming languages. “A system of types and type declarations is developed for the lambda-calculus and its semantic assumptions are identified. The system is shown to be adequate in the sense that it permits a preprocessor to check formulae prior to evaluation to prevent type errors.”

He begins by explaining what types are and why they matter, using the term in the usual programming-languages sense:

In general, the type system of a programming language calls for a partitioning of the universe of values presumed for the language. Each subset of this partition is called a type.

From a purely formal viewpoint, types constitute something of a complication. One would feel freer with a system in which there was only one type of object. Certain subclasses of the universe may have distinctive properties, but that does not necessiate an a priori classification into types. If types have no official status in a programming language, the user need not bother with declarations or type checking. To be sure, he must know what sorts of objects he is talking about, but it is unlikely that their critical properties can be summarized by a simple type system (e.g., prime numbers, ordered lists of numbers, ages, dates, etc.).

Nevertheless, there are good, pragmatic reasons for including a type system in the specifications of a language. The basic fact is that people believe in types. A number is a different kind of thing from a pair of numbers; notwithstanding the fact that pairs can be represented by numbers. It is unlikely that we would be interested in the second component of 3 or the square root of < 2,5 >. Given such predispositions of human language users, it behooves the language designer to incorporate distinctions between types into his language. Doing so permits an implementer of the language to choose different representations for different types of objects, taking advantage of the limited contexts in which they will be used.

Even though a type system is presumably derived from the natural prejudices of a general user community, there is no guarantee that the tenets of the type system will be natural to individual programmers. Therefore it is important that the type restrictions be simple to explain and learn. Furthermore, it is helpful if the processors of the language detect and report on violations of the type restrictions in programs submitted to them. This activity is called type-checking.

Then he switches without explanation to taking about static checkers, e.g:

We shall now introduce a type system which, in effect, singles out a decidable subset of those wfes that are safe; i.e., cannot given rise to ERRORs. This will disqualify certain wfes which do not, in fact, cause ERRORS and thus reduce the expressive power of the language.

So the confusion between programming-language and type-theory senses of the word began with the very first paper to use the latter.

1968: APL

APL-360 was the most popular dialect of APL. Its manual doesn't use the word “type”; it speaks of “representations” of numbers. But it considers these an implementation detail, not an important part of its semantics.

APL has a lot of unique terminology — monad and dyad for unary and binary operators, adverb and conjunction for high-order operators, and so on — so it's not surprising that it has its own word for types too.

1970: Pascal

Wirth's 1970 definition of Pascal is, as usual, plain-spoken: “The type of a variable essentially defines the set of values that may be assumed by that variable.” (But there's that “essentially”, like Algol's “basically”.)

1970-73: Lisp belatedly adopts the term

Like Fortran, early Lisps used the word “type”, but only in its ordinary English sense, never as a technical term. AIM-19, from 1960 or 1961, speaks of “each type of LISP quantity”, but doesn't use “type” unqualified. Similarly, the 1962 Lisp 1.5 Manual uses the word for various purposes, but not as an unqualified term for datatypes. The most common use is for function types (subr vs. fsubr); there are “types of variables” (normal, special, common), but datatypes were not, apparently, considered important enough to talk about. They might not have even been seen as a single concept — there are awkward phrases like “bits in the tag which specify that it is a number and what type it is”, which would be simpler with a concept of datatypes.

This changed in the early 1970s. The 1967 AIM-116a and 1970 AIM-190 still don't use “type”, but the 1973 Maclisp manual and 1974 Moonual do, and it consistently means “data type”. Most tellingly, they have typep, so the term was solidly ensconced in the name of a fundamental operator.

1973: Types are not (just) sets

By 1973, the definition of types as sets of values was standard enough that James Morris wrote a paper arguing against it: “Types are not sets”. Well, not just sets. He was talking about static typechecking, and argued that checking for abstraction-safety is an important use of static typechecking. The abstract explains:

The title is not a statement of fact, of course, but an opinion about how language designers should think about types. There has been a natural tendency to look to mathematics for a consistent, precise notion of what types are. The point of view there is extensional: a type is a subset of the universe of values. While this approach may have served its purpose quite adequately in mathematics, defining programming language types in this way ignores some vital ideas. Some interesting developments following the extensional approach are the ALGOL-68 type system, Scott's theory, and Reynolds' system. While each of these lend valuable insight to programming languages, I feel they miss an important aspect of types. Rather than worry about what types are I shall focus on the role of type checking. Type checking seems to serve two distinct purposes: authentication and secrecy. Both are useful when a programmer undertakes to implement a class of abstract objects to be used by many other programmers. He usually proceeds by choosing a representation for the objects in terms of other objects and then writes the required operations to manipulate them.

1977: ML and modern static typing

ML acquired its type system in about 1975 and was published in 1977. Until this point, the application of type theory to programming languages had been theoretical, and therefore had little influence. ML made it practical, which has probably contributed a lot to the terminological confusion.

ML's theoretical support (along with the misleading slogan “well-typed expressions do not go wrong”) came out in the 1978 paper A Theory of Type Polymorphism in Programming, which despite being about type theory, speaks of types containing values:

Some values have many types, and some have no type at all. In fact “wrong” has no type. But if a functional value has a type, then as long as it is applied to the right kind (type) of argument it will produce the right kind (type) of result—which cannot be “wrong”!

Now we wish to be able to show that—roughly speaking—an Exp expression evaluates (in an appropriate environment) to a value which has a type, and so cannot be wrong. In fact, we can give a sufficient syntactic condition that an expression has this robust quality; the condition is just that the expression has a “well-typing” with respect to the environment, which means that we can assign types to it and all its subexpressions in a way which satisfies certain laws.

The short version

So here's the very brief history of “type”:

  1. It wasn't used at all until 1958.
  2. Types as sets of values: Algol-58.
  3. The type-theory sense: Morris 1968.

These may not be the earliest uses. I got most of the old manuals from Paul McJones' collection, which is a good place to look for more. I welcome antedatings.

I'm also curious about the term “datatype”, which might plausibly be ancestral to “type”. I could find no uses of it older than “type”, but I may be looking in the wrong field. Statistical data processing is much older than computing, and has dealt with datatypes for a long time. Might the terms “datatype” and “type” have originated there?

Incorrect optimization in 1963

Floating-point users today are accustomed (or resigned, sometimes) to compilers that make invalid optimizations by assuming all arithmetic is mathematically correct instead of rounding. The situation used to be worse. A 1963 IBM Fortran II manual warns that it did this for integers too:

FORTRAN assumes that mathematically equivalent expressions are computationally equivalent. Hence, a sequence of consecutive multiplications, consecutive divisions, consecutive additions, or consecutive subtractions, not grouped by parentheses will be reordered, if necessary, to minimize the number of storage accesses in the object program.

Although the assumption concerning mathematical and computational equivalence is virtually true for floating point expressions, special care must be taken to indicate the order of fixed point multiplication and division, since fixed point arithmetic in FORTRAN is “greatest integer” arithmetic (i.e., truncated or remainderless). Thus, the expression

5*4/2

which by convention is taken to mean [(5 × 4)/2], is computed in a FORTRAN object program as

((5/2)*4

i.e., it is computed from left to right after permutation of the operands to minimize storage accesses.

The result of a FORTRAN computation in this case would be 8. On the other hand, the result of the expression (5 × 4)/2 is 10. Therefore, to insure accuracy of fixed point multiplication and division, it is suggested that parentheses be inserted into the expression involved.

(Reordering “to minimize the number of storage accesses” is pointless in a constant expression, but apparently the optimizer did it anyway.)

If this reordering can be prevented by redundant parentheses, then parentheses don't only affect parsing; they change semantics by introducing a barrier against algebraic transformations!

Giving parentheses this additional meaning has an unfortunate effect: other optimizations can no longer ignore them. The manual continues by describing one such problem:

One important type of optimization, involving common subexpressions, takes place only if the expression is suitably written. For example, the arithmetic statement

Y = A*B*C + SINF (A*B)

will cause the object program to compute the product A*B twice. An efficient object program would compute the product A*B only once. The statement is correctly written

Y = (A*B) * C + SINF (A*B)

By parenthesizing the common subexpression, A*B will be computed only once in the object program.

In general, when common subexpressions occur within a expression, they should be parenthesized.

There is one case in which it is not necessary to write the parentheses, because FORTRAN will assume them to be present. These are the type discussed in “Hierarchy of operations,” and need not be given. Thus

Y = A*B+C+SINF (A*B)

is, for optimization purposes, as suitable as

Y = (A*B)+C+SINF (A*B)

I'm not sure whether the problem is simply that A*B*C does not contain the subexpression A*B, or that the CSE lifter sees it but can't merge it with (A*B) because they're not equivalent in all contexts.

Optimizers today still have limitations, and still make invalid transformations, but they've become much more subtle!

Errors are not the same as incorrectness

Program checkers, if they are to check objective properties rather than the prejudices of their authors, must ground their judgements in some aspect of programs' behavior. (Or in their maintainers' behavior, but that's much harder to prove anything about.) Usually the property they check is whether the program will have errors at runtime. If it will fail dynamically, then the checker judges it a bad program statically.

This is an obvious premise, and it's the standard justification for all sorts of program checking, but it's not necessarily true, as Andreas Rossberg points out:

Take the following degenerate program for computing travel routes:

ComputeAndDisplayTravelRoute(inputs);
"boo" - 1;

This will throw a type error on the second line, and a tool like Dialyzer would (correctly) diagnose that (it's obviously trivial in this case). However, before this error is raised, the program actually successfully completes its designated job, namely computing a travel route and displaying it to the user. Yet such a program is defined as "invalid". I'm asking why.

Crashing on exit is a fairly common problem. (Games seem particularly prone to this, perhaps because graphics has so much hardware-dependent setup and teardown.) It doesn't usually cause any problem for the user, so it's not a high priority to fix. But the usual standard of program checking considers it unforgiveable.

Programs that produce errors (of any kind, not just type errors) are usually much worse than programs without. But not always. The properties we check are only an approximation to the ones we care about.

If Scheme were like Scheme

Scheme's numbers are not like the rest of its library. They're older, and they're mostly borrowed from other languages (Maclisp and Common Lisp), so they follow those languages' style rather than Scheme's. They're designed more for the convenience of users than of theorists; they have a usefully complete feature set; they have a printed representation; their operations are predefined and polymorphic and have very short names.

What would Scheme be like if numbers followed the same style as the rest of the language?

It would be necessary to import a library before using any numbers.

(import (scheme numbers))

Numeric constants would be provided as functions returning the constant, apparently because the section of RNRS they appear in is called “Standard Procedures”. Only the most basic constants would be provided; pi would not be among them.

(define (exact-rational-zero)
  (make-exact-rational (exact-integer-zero) (exact-integer-one)))

Numbers would have no printed representation. Creating them would require explicit constructor calls.

There would be no polymorphism. Most operations would include a type in their name.

(define (factorial n)
  (if (exact-integer<=? n (exact-integer-one))
    (exact-integer-one)
    (exact-integer-multiply! (factorial (exact-integer-subtract n (exact-integer-one))) n)))

The distinction between exact and inexact numbers would still be supposedly “orthogonal to the dimension of type”. But the lack of polymorphism would make it even more obvious that in practice exactness was simply one of the type distinctions: that between floats and everything else.

Floating-point numbers would be called “inexact rationals”. Their constructor would take a numerator and denominator, just like exact rationals; their floating-point representation would be considered an implementation detail. Various details of the specification would be inconsistent with IEEE floating point.

NaN would not be a number, of course. inf.0 and -inf.0 would be exact transfinite numbers, not inexact rationals. There would be no negative zero.

Names would be descriptive, like inexact-rational-square-root and exact-integer-greatest-common-divisor.

There would be exact-integer->list and list->exact-integer operations to convert to and from lists of digits (in arbitrary bases). Converting the lists into strings would be up to you. Converting anything other than exact integers to strings would also be up to you.

Numbers would be portably mutable. Some operations would have destructive versions. (If we did this exercise on Python, some would have only destructive versions.) Racket would omit these, supposedly to make optimization easier, but would have separate mutable numbers for programs that need them.

Operations more obscure than exponent would be left to SRFIs. Users would be able to choose between the widely supported SRFI and the complete SRFI.

exact-integer-divide would not be provided, on the grounds that it's not defined for all integers, and can't be implemented efficiently without special hardware.

There would be a portable way to use exact integers as indexes into lists, but not into vectors or strings. This would be remedied in R7RS.

Some implementations would support surprisingly obscure and practical floating-point operations, while omitting basic operations their authors never needed.

(define (numerically-stable? thunk tolerance)
  "Run a floating-point computation with various rounding modes to see
if this significantly changes the result. This is not a reliable test
of numeric stability, but it's an easy way to find bugs."
  (let ((down (call-with-rounding-mode round-down thunk))
        (up (call-with-rounding-mode round-up thunk))
        (nearest (call-with-rounding-mode round-to-nearest thunk))
        (zero (call-with-rounding-mode round-to-zero thunk))
        (roughly-equal? (lambda (a b)
                         (inexact-rational<=?
                          (inexact-rational-absolute-value
                           (inexact-rational-subtract a b))
                          tolerance)))))
    (and (roughly-equal? down up)
         (roughly-equal? down nearest)
         (roughly-equal? down zero)
         (roughly-equal? up nearest)
         (roughly-equal? up zero)
         (roughly-equal? nearest zero)))

There would be debates about whether eq? should “work” on numbers. This would really be about whether numeric operations should always return fresh numbers, and whether the compiler would be allowed to copy them, but no one would mention these merely implementational issues.

eqv? and equal? would compare numbers, even immutable ones, by identity. Hashtables would — OK, standard Scheme doesn't have hashtables. But if it did, the default hash function would hash numbers by identity, not by value.

Arithmetic overflow would still be “a violation of an implementation restriction”. There would still be no way to find out how large a number could safely be.

There would still be no bitwise operations on integers. Schemers who understood the purpose would advise using an implementation that supports bitvectors instead of abusing numbers. Those who did not would say they're easy to implement.

(define two (exact-integer-add (exact-integer-one) (exact-integer-one)))
(define (exact-integer-bitwise-and a b)
  (list->exact-integer (map exact-integer-minimum
                            (exact-integer->list a two)
                            (exact-integer->list b two))))

Complex numbers would, mercifully, be left to a SRFI. The SRFI number would be real, but in most implementations complex-number support would be purely imaginary.

All the comparison predicates would end in ?.

Edit: Replaced some stray uses of <= and + and min with their counterfactual-Scheme equivalents.

In the HN comments, cousin_it says:

We can see similar examples in other languages, e.g. C++ strings are "like C++" and a pain to use, while Java strings are "not like Java" and a pleasure to use. Maybe language design really isn't about general-purpose elegance, but about finding good special-purpose solutions.

Or about using the good general-purpose solutions you already have.

Parentheses are more annoying in infix

There's a lot of code in functional languages written with a C or Java accent. The reverse is much rarer, but I have seen some: C++ written with a Lisp accent.

I didn't like it.

I didn't like the fooP convention for predicates. I didn't like the large multi-line expressions. And I especially didn't like the redundant parentheses.

What? A lisper doesn't like parentheses?

Parens are not high on the list of things that bother me in Lisp. They're only a little verbose, only a little distracting, only a little trouble to match. Large expressions don't bother me either; they're clearer than the alternative. And I like foo-p, because it's short and pronounceable.

Was I just objecting to C++ that didn't look like C++? Was I offended by contact between pretty Lisp and icky C++?

For fooP, that's probably the whole of it. It's camelCase instead of hyphenated, so it looks wrong as Lisp, and it's not standard C++ style, so it looks wrong as C++. And I'd rather not have to explain to other C++ programmers why I'm using a convention from some weird academic language. But I don't have a substantive objection.

For the other two features, I do.

Large expressions in prefix notation are easy to parse. The root operator is plainly visible at the beginning, and indentation goes a long way toward making the structure clear. Large expressions in infix are not so easy. The root operator is buried somewhere in the middle, and one must parse much of the expression to find it. There's no easy way to indent infix expressions, so breaking an expression across multiple lines doesn't alleviate much of the parsing load. This is why programmers in infix languages usually prefer to break such expressions into multiple statements.

Parentheses in Lisp are consistent: they all delimit lists, and almost all delimit forms. The semantics of the forms may be arbitrarily variable, but those of the parens are always the same. In C++, however, parentheses have several different meanings. They sometimes override precedence, sometimes call (or declare) functions, sometimes do typecasts, and sometimes delimit conditions in control structures. So a nest of parentheses in C++ is much more ambiguous than in Lisp, and it takes more parsing effort to determine which ones are which.

This goes some way toward explaining why so many programmers are suspicious of Lisp's syntax. Large expressions and nests of parentheses are suspicious in infix languages, and this suspicion does not instantly vanish in a new language.

Effects vs. side effects

Commonly used terms get abbreviated. Thus functional programmers often say “effect” instead of “side effect”. I approve of this usage – not only because it's shorter, but because it frees up “side effect” for another concept. This is something assembly language programmers know, and have known for decades, that other programmers seldom speak of.

Most machines have no notion of a return value; the only way for parts of a program to communicate is by mutating registers. So assembly language programs must do all their communication by effect. This means they distinguish between different kinds of effect. In particular, they distinguish effects that are part of a routine's contract from those that, however consistent, are not intentional: side effects.

Consider this implementation of factorial on a typical register machine:

;The factorial function, iteratively
;args: r1 = n
;results: r2 = n!
;All other registers are preserved.
factorial:
  li r2, 1
loop:
  cmpi r1, 1
  ble done
  mul r2, r2, r1
  sub r1, r1, 1
  b loop
done:
  ret

This function leaves its result in r2, but also happens to set r1 to 1. This is a side effect: an effect not in the routine's contract. It is, of course, a bad idea to rely on these, but by accident or desperation, assembly programmers occasionally do, which is why they have a name for them.

(Recursive factorial is more complex than iterative on most machines – often absurdly so, if you strictly follow an ABI that wants you to save registers and construct stack frames. This is one of the reasons programmers accustomed to low-level languages don't take readily to recursion. To them, it looks unnecessarily complex, because it is complex in implementation. High-level languages hide this complexity, but low-level programmers know it's still there.)

It's not normal for programs in higher-level languages to have side effects in this sense, because they have fewer ways to accidentally have effects. Supposedly unobservable effects like preloading caches are common (and are occasionally relied on), but typically any observable effect that isn't part of the interface is a bug. So this concept is less useful in higher-level languages. The more general concept of relying on unspecified behaviour remains useful, though, and it's quite familiar from discussions of language specs.

Functional programming advocacy suffers from a focus on purity, where state is considered a sin to be avoided absolutely. One way the movement might make progress is to distinguish between different kinds of effects, so they could say which ones are deadly and which are venial, rather than treating all effects as indistinguishable evil. Vocabulary analogous to the assembly language programmers' “side effect” might help with this.

Customary semantics

What is the real, definitive semantics of a language? There are three standard answers:

  1. The natural-language specification, because it's the one the designers understand.
  2. The reference implementation, because it's unambiguous and well-tested.
  3. The formal semantics (of whichever flavor), because it avoids implementation concerns, so it's simpler than a real implementation. (Or because it's difficult and therefore “rigorous”.)

There's a controversial fourth option: the definitive semantics of a language is the behavior that is consistent across all conventional implementations.

This approach has some virtues:

  • It identifies the behavior you can rely on. Implementations have bugs and deliberate deviations from the spec, where you can't rely on the specified behaviour. They also have widely supported extensions which you can rely on, even though they're not in the spec.
  • Unlike any other means of defining semantics, implementations are heavily tested. Formal semantics can be tested by turning them into implementations, but seldom are; natural-language specifications aren't mechanically tested at all.
  • It's reconstructable. Users can always find out what their implementations do, even when the spec is not publicly available, or is difficult to read. (Most specs are.) Sometimes this shows them implementation-dependendent behavior, but by comparing implementations they can discover the customary semantics.

Deferring to custom is unpopular among language designers and theorists. We see it as an ill-defined, unstable foundation about which nothing can be known with confidence, and on which nothing can be built reliably. We remember the chaos that engulfed HTML and CSS and Javascript when their users treated buggy implementations as specs, and we don't want it to happen again. We want our semantic questions to have authoritative answers, and mere custom does not provide that.

But it's the de facto standard among users of languages. Most programmers are not language lawyers, and can't readily figure out whether the spec says their code will work. But they can easily try it and see what happens.

We can tell users not to do this. We can tell them to avoid empiricism, to seek authority rather than evidence, to shut their lying eyes and trust in doctrine. This is not good advice in most areas, not even in other areas of programming, nor for semantics of other languages natural or artificial. Is it really good advice for programming languages?

Whether it's good advice or bad, users don't listen. Their models are based on the behaviour they observe. As a result, many popular “myths” about languages — that is, widely held beliefs that are officially supposed to be false — are true in the customary semantics. For example, here are some parts of C's customary semantics that are not part of the formal specification. Some of them are violated on unusual architectures, but most C users have never written for such an architecture, so custom doesn't care.

  • Signed integers are represented in two's complement. (Rumor has it this is not quite always true.)
  • Signed integer overflow is modulo word size, like unsigned.
  • All pointer types have the same representation: an integer.
  • NULL is represented as 0.
  • Memory is flat: it's all accessible by pointer arithmetic from any pointer.
  • Pointer arithmetic is always defined, even outside array bounds. Overflow is modulo word size, just like integers.
  • Dereferencing an invalid pointer, such as NULL or an out-of-bounds pointer, blindly tries to use the address.
  • Compilers generate native code. The built-in operators compile to machine instructions.
  • char is exactly eight bits wide.
  • Characters are represented in a superset of ASCII.

(I thought sizeof(char) == 1 was only in the customary semantics, but it's actually in the spec.)

Much of the furor over optimizations that exploit undefined behaviour is because they're invalid in the customary semantics. Some C compiler maintainers have come to believe that the spec is the whole of the contract between compilers and users, and thus that users don't care about semantics not defined therein. It's a convenient belief, since it permits optimizations that would otherwise be impossible, but it's wildly at odds with what their users want. This isn't the only problem with these optimizations — they make for perverse error behaviour under any semantics — but this is why users tend to see them as not merely bad but incorrect.

Language lawyers, especially those who write specs, should take customary semantics more seriously, so they don't contradict the semantics in actual use.