Overloading in math

A few months ago Joe Marshall did a series of posts about how computer science has affected his view of math, beginning with its surprising lack of rigor: unlike programming languages, math notation is often ambiguous.

Check out these two equations:

$$H(Q) = \int_0^1 \log Q'(p)\,dp$$

$$H(x) = -\int_{-\infty}^{\infty} f(x) \log f(x)\,dx$$

By the first equation, Q is obviously some sort of function (the quantile function). H therefore operates on a function. By the second equation, the argument to H, which is called x, is used as the argument to f, which is the probability density function. f maps numbers to numbers, so x is a number. The types don't match.

This type error is precisely why this notation works. We can write Q to mean both a function and its result (or, to put it differently, both a quantity and a function relating that quantity to something else) because the types differentiate them. It's just like overloading in programming languages, but it's more pervasive, it's not declared in advance, and one of the overloaded variables isn't even a function.

Indeed, if both Qs were functions, or even if Q returned another function instead of a number, this overloading wouldn't work. It would be too easy to confuse the two. But since their types are so different, we can safely overload arbitrary variables with the functions that compute them, and only programmers will be confused.

Programmers are paranoid about ambiguous notation, because computers can't be trusted to resolve it right. Mathematicians are much less so, because humans handle it better.

I think we can learn something from the mathematicians here. Language designers sometimes reject new overloadings for fear they'd be confusing, even if they wouldn't pose a problem for compilers. For example, overloading arithmetic operations to work on collections or functions sounds scary — wouldn't it be hard to tell what was being added and what was just being mapped or composed? But similarly wide-ranging overloadings in mathematical notation don't cause much trouble, and indeed these specific overloadings exist and work fine in APL and J. When a new construct looks confusing to humans but not to machines, that may be a sign that it's just unfamiliar.
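
For concreteness, here's a minimal sketch in Clojure of the function half of that overloading: a hypothetical lift2 turns an ordinary numeric operator into its pointwise analogue, math's usual convention that (f + g)(x) = f(x) + g(x).

```clojure
;; A sketch of lifting arithmetic to functions. lift2 and f+ are
;; hypothetical names, not standard Clojure.
(defn lift2 [op]
  (fn [f g]
    (fn [x] (op (f x) (g x)))))

(def f+ (lift2 +))  ; pointwise "addition" of two functions

;; "Add" the squaring function to inc, then apply the sum at 3:
((f+ #(* % %) inc) 3)
;;=> 13   ; (3 * 3) + (3 + 1)
```

A real overloading would dispatch on the argument types instead of using a separate name, but the types are just as distinguishable either way.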

4 comments:

  1. Operator overloading, where the operators are used for semantically different kinds of operations, is apt to cause confusion in edge cases.

    You talk about overloading operators on containers. Let's say you overload + for appending. What happens when you try to append an integer expression to a list of integers? The + starts looking ambiguous.
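
    A sketch of that collision, with hypothetical names for the two readings a single overloaded + would have to cover:

    ```clojure
    ;; Hypothetical names for the two meanings of + on collections:
    (defn plus-append [xs y] (conj xs y))       ; + as "append"
    (defn plus-map [xs y] (mapv #(+ % y) xs))   ; + as mapped addition

    (plus-append [1 2 3] 4)  ;=> [1 2 3 4]
    (plus-map    [1 2 3] 4)  ;=> [5 6 7]
    ;; Both calls pass a vector of integers and an integer, so the
    ;; argument types alone can't pick a reading.
    ```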

    C++ iostreams is a nice example of how not to overload operators. Not only do the stream operators make outputting integer bitwise shift expressions tricky, but they can even change the apparent order of execution.

    Using + to concatenate strings in a statically typed language isn't too bad, because in the absence of automatic conversions to and from strings, there's little risk of confusion. General collections would probably be a step too far though.

  2. I'm thinking of overloading + for (map + ...), not for something semantically unrelated like appending. This preserves the useful properties like associativity, and makes vector arithmetic naturally work: (* #(1 2 3) 2) => #(2 4 6). Since it's already used in math, we know this overloading isn't particularly confusing.
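
    A runnable sketch of that semantics, with a hypothetical times standing in for the overloaded *:

    ```clojure
    ;; Lifted multiplication: map over any vector argument and
    ;; bottom out in ordinary *. times is a hypothetical name.
    (defn times [a b]
      (cond (and (vector? a) (vector? b)) (mapv times a b)
            (vector? a) (mapv #(times % b) a)
            (vector? b) (mapv #(times a %) b)
            :else (* a b)))

    (times [1 2 3] 2)      ;=> [2 4 6]
    (times [1 2] [10 20])  ;=> [10 40]
    ```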

  3. That kind of overloading is just lifting + to a new domain: it's still addition. Note that APL uses unique notations for inner and outer product, distinct from multiplication; indeed, the inner product notation is a special case of a general map-reduce notation.
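
    For instance, APL's f.g is a reduce with f over a pairwise map with g; +.× is the ordinary dot product. A sketch with a hypothetical name inner:

    ```clojure
    ;; APL-style generalized inner product: reduce with f over a
    ;; pairwise map with g. inner is a hypothetical name.
    (defn inner [f g xs ys]
      (reduce f (map g xs ys)))

    (inner + * [1 2 3] [4 5 6])  ;=> 32, i.e. APL's +.× dot product
    ```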

    This is all very different from confounding functions with their values at some arbitrary argument. The fact is, mathematical notation suffers because it has no representation of functions except lambda, and most mathematicians wouldn't recognize a lambda if it bit them in the ass.

  4. Eh, next you'll hear that it's a "legacy language". Fashion is a whore.

