Exception logic

Programming generally requires Boolean logic, so every practical language supports it with if, and, not, and other such operators. In almost all languages, these operators receive their Boolean inputs as ordinary values. (if (and a b) c d) evaluates a, and the resulting value determines whether to evaluate b, and so on. This is all so obvious and familiar that it's hard to imagine it could be otherwise.

Icon does it differently. Icon expressions (and functions) have two continuations: they can succeed and produce a value, just like expressions in a thousand other languages, or they can fail. This feature has many uses, and one of them is to represent Booleans: Icon encodes them not in values but in success or failure. Its if operator doesn't look at the value returned by the conditional expression — it only looks at whether it succeeds or fails. Similarly, its or operator (called |) evaluates its second argument only if the first fails. In Icon, Booleans aren't values.

Few languages have Icon's success/failure semantics, but it doesn't take exotic semantics to encode Booleans this way. Any language with exceptions can do something similar: returning normally means true, and throwing an exception means false. Most languages with good support for exceptions use them extensively in their standard libraries, so functions that return exception Booleans are already common.

It's not often recognized, though, that one can define operators on these Booleans. I briefly thought Perl 6 did, based on a misunderstanding of its andthen and orelse operators, but it turns out they operate on defined vs. undefined values, not exceptions. (Perl, true to form, has several ways to represent booleans, one of which is defined/undefined.) However, there are a few exception-Boolean operators in existing languages, even if they're not usually described that way:

  • Ordinary progn/begin is the exception equivalent of and: it evaluates each subform only if the previous ones returned normally.
  • Many languages have an (assert expression) operator, which converts a value Boolean into a exception Boolean.
  • Common Lisp's (ignore-errors expression t) converts an exception Boolean into a value Boolean, returning nil on error and t otherwise. (However, it only handles errors, not all serious-conditions.)
  • Some test frameworks have a (must-fail expression) form, which throws an exception iff the expression doesn't. This is the exception-Boolean equivalent of not.

Given macros and an exception-handling construct, you can easily build others, such as or and if:

(defmacro succeeds (form)
  "Convert an exception Boolean to a value Boolean: return
true if this form returns normally, or false if it signals a
  `(handler-case (progn ,form t)
     (serious-condition () nil)))

(defmacro eif (test then &optional else)
  "Exception IF: THEN if TEST returns normally, otherwise ELSE."
  `(if (succeeds ,test)

(defmacro eor (&rest forms)
  "Like OR, but treating normal return as true and
SERIOUS-CONDITION as false. Perhaps surprisingly, (EOR)
signals an exception, since that's its identity."
  (cond ((null forms) `(error "No alternatives in EOR."))
 ((null (cdr forms)) (car forms))
 (t `(handler-case ,(car forms)
       (serious-condition () (eor ,@(cdr forms)))))))

These operators may be a fun exercise, but are they of any practical use?

There is a tension in the design of functions that might fail in normal operation — those that depend on the cooperation of the environment, like opening a file or making a TCP connection, and even those that return something that may or may not exist, like hashtable lookup. Such functions face an awkward choice between indicating failure by their return value (e.g. returning nil, or a Maybe) or throwing an exception. A special return value is most convenient for callers who want to handle failure, because they can easily detect it with ordinary operators. But it often forces those who can't handle the failure to explicitly check for it anyway. An exception, on the other hand, is perfect for callers who can't handle failure and just want it to propagate up automatically, but awkward for those who can handle it, since in most languages it's much easier to check for nil than to catch an exception. So functions are forced to choose how they report failure based on guesses about how often their callers will want to handle it.

Exception logic might ameliorate this. If you can handle exceptions as easily as return values, then exceptions are no longer at an expressive disadvantage. This means you can use them more consistently: virtually any function that might fail can indicate it with an exception — even the commonly tentative ones like dictionary lookup. So exception operators are interesting not for their own sake, but because they may allow other parts of a language to be more consistent.

Like all error handling features, this is hard to experiment with in pseudocode, because it's so easy to ignore error behaviour. It's also hard to experiment with in a real language, because its value depends on changing so many library functions. So I wouldn't be surprised if I've overlooked some common case where exceptions are still inconvenient. But this sort of thing is worth looking into, because it could make libraries simpler and more consistent.

Haskell and syntactic complexity

Why isn't if a function in Haskell?

In most languages, if is a special form (or local equivalent). That's because if is lazy in its last two arguments, so it can't be a function in an eager language. But Haskell has no trouble with laziness; if can trivially be a function, with no need for special semantics, let alone syntax. There's even a page on the Haskell Wiki advocating switching to the simple functional if.

All this must have been obvious to the designers of Haskell. So why did they choose the Algolesque if ... then ... else ... syntax instead?

Of course this question has come up before. Paul Hudak, one of the designers, explains:

Dan Doel's answer is closest to the truth:
I imagine the answer is that having the syntax for it looks nicer/is clearer. "if a b c" could be more cryptic than "if a then b else c" for some values of a, b and c.

except that there was also the simple desire to conform to convention here (I don't recall fewer parentheses being a reason for the choice). In considering the alternative, I remember the function "cond" being proposed instead of "if", in deference to Scheme and to avoid confusion with people's expectations regarding "if".

To my eye, if a then b else c improves neither clarity nor familiarity, but evidently the Haskell designers thought otherwise. More interestingly, they didn't think the added syntactic complexity was much of a problem. How can this be? Doesn't Haskell culture value simplicity?

Yes, but not so much simplicity of syntax. I'm misled here by a superficial resemblance between the cultures of Haskell and Lisp. Both cultures are obsessed with mechanically processing code, and therefore want the language core to be as simple as possible. Since a minimal core is impractical to program in, both expect a larger, more useful language to be defined by translation to the core, so its complexity can be mechanically eliminated. And both consider the representation of code as text to be separate, at least in principle, from the core. So at first glance, it seems as if they should have the same attitude to syntactic complexity.

But they treat it quite differently. Lisp's culture considers syntax unimportant, and therefore tries to make it as simple and transparent as possible, so it won't prevent humans from seeing through it — because code is much more interesting than its textual representation. But Haskell's culture considers syntax safely separate from the language core, and is therefore willing to tolerate complexity in it. Since it goes away without inflicting any complexity on the core, why shouldn't it include whatever features are convenient?

This approach has given Haskell one of the prettiest syntaxes in the world, so I can't say it's wrong. It is uncomfortable, though, for a Lisper to see a culture which draws wildly different conclusions from seemingly similar premises. Maybe this is explained by modest differences in degree: Haskell values textual beauty a bit more than Lisp, and code-processing a bit less. Is that enough to explain the difference? I don't know. I don't understand.

From Python to C++

A few days months ago someone asked me (well, a group that happened to include me) for advice on learning C++, for a Python speaker.

Ouch. What do you tell someone who's getting started with C++? What can you say that won't scare them away?

You could tell them the good reason behind many of C++'s complexities: that it's lower-level than Python, especially in data representation. You could tell them that's good when you care about efficiency and bad the rest of the time. But they probably already know that, and it's not the most useful sort of information.

You could give them bits of good advice, and warn them of the dangers. There are hundreds of specific tips and tricks they'll need to learn, but they wouldn't understand most of them until they knew more of the language. Failing that, you could at least give them some general warnings:

  • There is much less agreement on good style. There are often multiple incompatible accepted ways to do something, like, say, basic I/O. Some of the advanced features are painful to the point of uselessness.
  • Parts of the standard library are dangerously broken.
  • For reasons of backward-compatibility and low-level-ness, C++ is complicated. Don't expect to keep the whole language in your head, the way you can for Python.
  • Collection support is pretty bad.
  • There is a way to do anything, but it may be more trouble than it's worth.
  • A lot of C++ tutorial material appears to be written for and by idiots.
  • Manual memory management is hard. If it accounts for half of your bugs, that's normal.

But they'd forget all that before they learned enough to apply it. Is there a shorter summary of the differences between C++ and Python?

I'd put it this way: you can't trust C++. Python may have a few warts, but for the most part it makes sense, because someone designed it. Most features are there for a good reason, and the obvious way to do things is generally the right way. The language is a garden, planted with useful things, and its pathways are generally safe and will lead you somewhere useful.

C++, on the other hand, is a wilderness. It may have almost as many fruits and flowers as the garden, but they're hidden among poisonous plants and thorns, and guarded by poisonous snakes and tigers. There are paths, but they're hard to follow, and some of them lead off cliffs or into swamps or snares. To navigate C++, you must be your own guide, because the language won't help you.

I don't hate C++. I admire the degree to which it can support higher-level constructs without losing its low-level powers. But it's a hard language to introduce someone to, especially someone who's coming from Python and therefore expects languages to make sense.

Why the imperative examples?

Speaking of example programs: why are so many examples stateful? Wherever algorithms are explained — in textbooks, in academic papers, in websites formal and otherwise — the style of choice is overwhelmingly imperative. Assignment is everywhere; simple maps and folds are written as explicit iteration; data structures are updated destructively for no reason; even recursion is often expressed by loops and state. This is true even in pseudocode, where there's no practical reason not to write functionally. This annoys me. Why can't we write our examples clearly?

Some of this must be simply the inertia of formality. Imperative style used to be the only well-known option, so it's traditional, and when people try to look authoritative by imitating older authorities, they tend to copy their style, even if they wouldn't normally write that way. This happens for writing style, so why shouldn't it happen for programming style too?

Didactic examples also have a good reason to be conservative in the language they use: they're explaining something to someone who doesn't already know it, and may be ignorant of other concepts as well. State and iteration are familiar even to nonprogrammers, while many of the concepts of functional programming are not very widely known even among programmers. So maybe the imperative examples are comprehensible to a larger audience?

And maybe there's a cultural gap: if the people who study (and explain) algorithms have little overlap with those who study how to express them, they may simply be unaccustomed to functional style, and prefer something more familiar. But it seems to me I've seen oddly imperative examples even from people who know better and would never write “real” code that way.

I also suspect that state, especially in small programs, isn't as confusing as we like to think. Little examples don't have the maintenance and modularity problems that large programs do, so maybe state isn't much of a problem for them, and there's not much pressure to write more functionally.

But it still irks me. Is the state of the art, in 2010, to describe algorithms in terms of assignment? Must we communicate new ideas by implementations we would consider unreadable in any other context?

I know there are some people who explain algorithms with functional examples, especially in the Haskell community. But why isn't this the norm?

Expressiveness isn't always good for examples

One of the most common ways to learn a new concept is by reading an example implementation. I expect that most programmers, given a choice, prefer to see the example in as high-level a language as possible, to minimize the amount of irrelevant detail they have to ignore. I do, anyway... don't I?

Recently I saw an unfamiliar statistics algorithm illustrated in C++ — in particular, in a very basic subset of C++, with one class and no pointers or templates. To my surprise, I found that reassuring. The use of an unexpressive language made it obvious that there was nothing fancy going on — no fancy control flow, no infinite data structures, no high-order functions, no polymorphism. C++ has many potentially confusing features (e.g. templates, overloading), but in a one-page example, it was easy to see that none of them were used. It was also obvious that the algorithm didn't depend on any exotic language features like lambda or laziness or bignums or garbage collection, so I could easily use it in any language.

A more powerful language might have expressed the example in less code, but not a lot less — it was largely arithmetic, which C++ is not bad at. And in a more powerful language, even a smaller amount of code might have expressed many more plausible things. The choice of an obviously unexpressive language ruled out large classes of things the example wasn't doing, allowing me to more quickly understand it without getting lost in a large space of possibilities.

On the other hand, I have already forgotten the algorithm, so maybe I didn't understand it very well after all.

Java gets newline handling right

The platform-dependent representation of newlines (as CR, LF, or CR+LF) is a surprisingly persistent annoyance. In theory, this problem was solved decades ago, when I/O libraries began to automatically convert from the host platform's newline convention to a portable in-memory representation. But in practice, files are routinely shared across platforms without being converted along the way, so a file may contain any of the three newline sequences, or even a mixture of them. (No, really, I've run into this repeatedly.) Knowing the host platform's newline convention doesn't help when the data doesn't follow it.

The obvious solution is to recognize all three newline endings, and treat them all identically. Unfortunately this creates ambiguity: a character such as CR might be a newline, or it might by a carriage return that occurs in the middle of a line, and the two will read identically. This means attempting to copy a file by reading and writing lines won't always preserve the original file. But this isn't a problem for most programs, because those that read lines of text generally only want input that makes sense as text, and that doesn't include arbitrary control characters. (Compare to C's restriction on the null character: it's valid but rarely plausible, so the restriction is not usually a problem.) As long as there are other operations for reading data in its raw form without interpreting it as text, the operation for reading lines need not distinguish between newline conventions.

It's also convenient to do newline conversion in the read-line operation, not as a feature of the underlying stream. This makes it easy to reversibly read text with control characters, and also makes it easier to mix text and other data in one stream. As a bonus, it eliminates the need to specify text or binary mode when opening a file.

My brilliant ideas have, of course, already occurred to someone else, and been implemented in a popular language: Java. From the documentation for java.io.BufferedReader:

public String readLine()
                throws IOException

Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.

A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached

Wrapping a stream in a BufferedReader is awkward, but readLine does exactly what I want. Not even Perl solves this text processing problem so conveniently! Java is not rich in such conveniences, so I was surprised to discover that it has this one. It may be motivated by portability: Java wants programs to behave identically on all platforms, and platform-dependent newline handling violates that.

C# (or rather .NET) follows Java here. So, presumably, do most JVM and .NET languages. I suspect there are others that have either borrowed this feature or invented it independently. Which ones?

Update April 2014: Python calls this “universal newlines”, and has had it since 2003.

(It would be nice to detect text encodings the same way, since many streams have unknown or mislabeled encodings. Unfortunately, there are byte sequences that are not only possible but plausible in more than one encoding, so there is no solution reliable enough to be the default.)