Effects vs. side effects

Commonly used terms get abbreviated. Thus functional programmers often say “effect” instead of “side effect”. I approve of this usage – not only because it's shorter, but because it frees up “side effect” for another concept. This is something assembly language programmers know, and have known for decades, that other programmers seldom speak of.

Most machines have no notion of a return value; the only way for parts of a program to communicate is by mutating registers (or memory). So assembly language programs must do all their communication by effect, and their programmers learn to distinguish between different kinds of effect. In particular, they distinguish effects that are part of a routine's contract from those that, however consistent, are not intentional: side effects.

Consider this implementation of factorial on a typical register machine:

;The factorial function, iteratively
;args: r1 = n
;results: r2 = n!
;All other registers are preserved.
factorial:
  li r2, 1
loop:
  cmpi r1, 1
  ble done
  mul r2, r2, r1
  sub r1, r1, 1
  b loop
done:
  ret

This function leaves its result in r2, but also happens to set r1 to 1. This is a side effect: an effect not in the routine's contract. It is, of course, a bad idea to rely on these, but by accident or desperation, assembly programmers occasionally do, which is why they have a name for them.

(Recursive factorial is more complex than iterative on most machines – often absurdly so, if you strictly follow an ABI that wants you to save registers and construct stack frames. This is one of the reasons programmers accustomed to low-level languages don't take readily to recursion. To them, it looks unnecessarily complex, because it is complex in implementation. High-level languages hide this complexity, but low-level programmers know it's still there.)

It's not normal for programs in higher-level languages to have side effects in this sense, because they have fewer ways to accidentally have effects. Supposedly unobservable effects like preloading caches are common (and are occasionally relied on), but typically any observable effect that isn't part of the interface is a bug. So this concept is less useful in higher-level languages. The more general concept of relying on unspecified behaviour remains useful, though, and it's quite familiar from discussions of language specs.

Functional programming advocacy suffers from a focus on purity, where state is considered a sin to be avoided absolutely. One way the movement might make progress is to distinguish between different kinds of effects, so they could say which ones are deadly and which are venial, rather than treating all effects as indistinguishable evil. Vocabulary analogous to the assembly language programmers' “side effect” might help with this.

Customary semantics

What is the real, definitive semantics of a language? There are three standard answers:

  1. The natural-language specification, because it's the one the designers understand.
  2. The reference implementation, because it's unambiguous and well-tested.
  3. The formal semantics (of whichever flavor), because it avoids implementation concerns, so it's simpler than a real implementation. (Or because it's difficult and therefore “rigorous”.)

There's a controversial fourth option: the definitive semantics of a language is the behavior that is consistent across all conventional implementations.

This approach has some virtues:

  • It identifies the behavior you can rely on. Implementations have bugs and deliberate deviations from the spec, where you can't rely on the specified behavior. They also have widely supported extensions which you can rely on, even though they're not in the spec.
  • Unlike any other means of defining semantics, implementations are heavily tested. Formal semantics can be tested by turning them into implementations, but seldom are; natural-language specifications aren't mechanically tested at all.
  • It's reconstructable. Users can always find out what their implementations do, even when the spec is not publicly available, or is difficult to read. (Most specs are.) Sometimes this shows them implementation-dependent behavior, but by comparing implementations they can discover the customary semantics.

Deferring to custom is unpopular among language designers and theorists. We see it as an ill-defined, unstable foundation about which nothing can be known with confidence, and on which nothing can be built reliably. We remember the chaos that engulfed HTML and CSS and Javascript when their users treated buggy implementations as specs, and we don't want it to happen again. We want our semantic questions to have authoritative answers, and mere custom does not provide that.

But it's the de facto standard among users of languages. Most programmers are not language lawyers, and can't readily figure out whether the spec says their code will work. But they can easily try it and see what happens.

We can tell users not to do this. We can tell them to avoid empiricism, to seek authority rather than evidence, to shut their lying eyes and trust in doctrine. This is not good advice in most areas, not even in other areas of programming, nor for the semantics of other languages, natural or artificial. Is it really good advice for programming languages?

Whether it's good advice or bad, users don't listen. Their models are based on the behaviour they observe. As a result, many popular “myths” about languages — that is, widely held beliefs that are officially supposed to be false — are true in the customary semantics. For example, here are some parts of C's customary semantics that are not part of the official spec. Some of them are violated on unusual architectures, but most C users have never written for such an architecture, so custom doesn't care.

  • Signed integers are represented in two's complement. (Rumor has it this is not quite always true.)
  • Signed integer overflow is modulo word size, like unsigned.
  • All pointer types have the same representation: an integer.
  • NULL is represented as 0.
  • Memory is flat: it's all accessible by pointer arithmetic from any pointer.
  • Pointer arithmetic is always defined, even outside array bounds. Overflow is modulo word size, just like integers.
  • Dereferencing an invalid pointer, such as NULL or an out-of-bounds pointer, blindly tries to use the address.
  • Compilers generate native code. The built-in operators compile to machine instructions.
  • char is exactly eight bits wide.
  • Characters are represented in a superset of ASCII.

(I thought sizeof(char) == 1 was only in the customary semantics, but it's actually in the spec.)

Much of the furor over optimizations that exploit undefined behaviour is because they're invalid in the customary semantics. Some C compiler maintainers have come to believe that the spec is the whole of the contract between compilers and users, and thus that users don't care about semantics not defined therein. It's a convenient belief, since it permits optimizations that would otherwise be impossible, but it's wildly at odds with what their users want. This isn't the only problem with these optimizations — they make for perverse error behaviour under any semantics — but this is why users tend to see them as not merely bad but incorrect.

Language lawyers, especially those who write specs, should take customary semantics more seriously, so they don't contradict the semantics in actual use.

Why is breadth-first numbering hard?

John Launchbury gave Chris Okasaki an annoying puzzle:

Given a tree T, create a new tree of the same shape, but with the values at the nodes replaced by the numbers 1 .. |T| in breadth-first order.

Go ahead and solve it. I'll wait.

If you want to solve it functionally, I'll wait longer.

Chris posed this puzzle to many functional programmers, and found that they had a surprisingly hard time with it. They took a long time to solve it, and their solutions were seldom elegant. He came up with various hypotheses as to why: did the programmers not know breadth-first traversal or queues? Did they prematurely commit to lists or pattern matching? He didn't seem to find any of them convincing. Neither do I.

One hypothesis he didn't mention is that most functional programmers see a recursive data structure and immediately try to process it by straightforward structural recursion, with a call tree isomorphic to the data structure. When you have many tools and you encounter a nail, you reach for your hammer, right? But in this case structural recursion is the wrong tool, and it takes a while for programmers to backtrack far enough to notice.
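To make the instinct concrete, here's a sketch in Haskell (the binary Tree type and the name dfnum are mine, chosen for illustration): structural recursion with a counter threaded through numbers the nodes almost effortlessly, but in depth-first order, because the call tree follows the shape of the data rather than the order of the levels.

data Tree a = E | T (Tree a) a (Tree a)

-- Structural recursion, threading a counter through the traversal:
-- every node gets a number, but in preorder (depth-first), not breadth-first.
dfnum :: Tree a -> Tree Int
dfnum t = fst (go 1 t)
  where
    -- go i t: number t starting at i, and return the next unused number.
    go i E         = (E, i)
    go i (T a _ b) =
      let (a', i')  = go (i + 1) a   -- the left subtree takes the next numbers,
          (b', i'') = go i' b        -- then the right subtree
      in  (T a' i b', i'')

Getting from this to breadth-first order means abandoning the isomorphic call tree entirely, which is exactly the backtracking that takes a while.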

It may take even longer for them to identify the right tool. Queues, like hashtables, are a little awkward for functional programmers, because their most natural implementations are stateful, as are many of their applications. They're almost always used linearly (i.e. there's only one version of the queue at a time), so eschewing state buys no useful flexibility, and incurs the extra hassle of explicitly passing the updated queue around. It also prevents using the efficient circular-buffer representation, just as it usually prevents using hashtables.

They're also a little awkward to use in functional languages, because none of the most familiar and widely implemented functional data structures (lists, tree dictionaries, tree sets, tries) is easily used as a queue, so would-be queue users must look up a queue library, or build one, or use pairs of lists (if they know this trick), or use some inappropriate data structure, or give up and use some other algorithm. Which is what most of Chris's subjects did.
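For anyone who hasn't seen it, the pairs-of-lists trick looks roughly like this (a minimal sketch; the names are mine): keep a front list in order and a back list reversed, push onto the back, pop from the front, and reverse the back into the front whenever the front runs out. Each element is reversed at most once, so the amortized cost per operation is constant (at least when the queue is used linearly, as it is here).

-- A FIFO queue as a pair of lists: the front, in order, and the back, reversed.
-- Invariant: the front is empty only when the whole queue is empty.
type Queue a = ([a], [a])

emptyQ :: Queue a
emptyQ = ([], [])

isEmptyQ :: Queue a -> Bool
isEmptyQ (front, _) = null front

-- Restore the invariant after pushing or popping.
checkQ :: Queue a -> Queue a
checkQ ([], back) = (reverse back, [])
checkQ q          = q

snocQ :: Queue a -> a -> Queue a
snocQ (front, back) x = checkQ (front, x : back)

headQ :: Queue a -> a
headQ (x : _, _) = x
headQ _          = error "headQ: empty queue"

tailQ :: Queue a -> Queue a
tailQ (_ : front, back) = checkQ (front, back)
tailQ _                 = error "tailQ: empty queue"

It works, but every caller still has to pass the updated queue along by hand.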

Meanwhile, Java programmers use the language's ordinary LinkedList class (which is a doubly-linked list, and thus a reasonably efficient deque) to win contests without having to worry about any of this. Can your functional language do as well?
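With some effort, yes. Here's one way, as a sketch rather than a showcase: it reuses the two-list Queue operations above, and (if I remember right) it's close in spirit to the queue-based solution Okasaki himself gives. The trick is that the recursive call returns the already-numbered subtrees in reverse of the order they were dequeued, so the two children you just enqueued come back off the front of the result.

-- The same Tree type as in the earlier sketch, with Show added for printing.
data Tree a = E | T (Tree a) a (Tree a)
  deriving Show

-- Breadth-first numbering by threading a queue of subtrees.
bfnum :: Tree a -> Tree Int
bfnum t = headQ (go 1 (snocQ emptyQ t))
  where
    -- go i q: number the subtrees in q breadth-first, starting at i.
    -- Returns them numbered, in reverse of their order in q, so the
    -- children just enqueued come back off the front first.
    go i q
      | isEmptyQ q = emptyQ
      | otherwise  = case headQ q of
          E       -> snocQ (go i (tailQ q)) E
          T a _ b ->
            let out = go (i + 1) (snocQ (snocQ (tailQ q) a) b)
                b'  = headQ out               -- numbered right child
                a'  = headQ (tailQ out)       -- numbered left child
            in  snocQ (tailQ (tailQ out)) (T a' i b')

For example, bfnum (T (T E 'a' E) 'b' (T E 'c' E)) comes out as T (T E 2 E) 1 (T E 3 E). It's short once you see it, but nothing about it is obvious.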