Any abstraction expresses something all its instances share. Usually this is semantic: all instances have some common meaning. They represent the same values, or perform the same computation, or support the same operations. When you know a construct is an instance of a certain abstraction, you know something about what it *means*.

Some abstractions are different. Their instances have no meaning in common, only algebraic properties. These are purely algebraic abstractions. Such an abstraction tells what transformations are valid on its instances' expressions, but says nothing about what they mean.

The classic algebraic abstractions are (of course) those of abstract algebra: groups and rings and fields and such. They abstract the properties necessary for algebraic transformations, and nothing else. If you know that your objects and their operators form a ring, you can manipulate formulae and even prove theorems about them, without knowing anything about what they mean.

In contrast, most abstractions in computing focus on meaning, and express algebraic properties only incidentally. If you have a `java.util.Collection`

or a `java.util.Map`

, you know you can add and remove items, test whether they're there, and iterate over them — but do you know any algebraic properties? Even the most basic properties are broken by unusual collections like caches or Bloom filters. They're semantically legitimate collections, and their algebraic properties are unreliable because they're irrelevant.

(Algebraic abstractions are not entirely reliable either, because most of their computational incarnations don't quite satisfy their axioms. Reflection and other debugging features can often detect differences between supposedly equivalent objects. Limitations of memory and time create edge cases where expressions equivalent in denotation are different in operation. Floating-point arithmetic makes a sport of breaking nearly every algebraic property that could be expected of it. But similar problems afflict nearly all attempts to reason about abstractions; they're not specific to algebraic abstractions, and they don't make them useless. The equivalences generally hold for the properties we care to preserve, so they're correct in practice though not in theory.)

### Haskell typeclasses

Most Haskell typeclasses have semantic content: `Show`

and `Num`

are about operations with the same meaning for all their instances; `Eq`

expects algebraic properties (reflexivity and transitivity) but still defines the meaning of `==`

. There are a small but increasing number of purely algebraic typeclasses: the Prelude has `Monoid`

, `Functor`

, `Applicative`

and `Monad`

, whose instances have nothing in common but algebraic equivalences.

This is why monads are so hard to learn. Each student of Haskell asks what monads *mean*, and invents a variety of wrong answers (typically semantic generalizations of `IO`

actions), because they're sure that such an important abstraction must be meaningful, and have never heard of algebraic abstraction. Eventually they learn to use monads without asking what they mean, because monads don't mean anything.

This is a sore point among Haskellers. It will get more sore, because Haskell is gaining more algebraic abstractions. `Applicative`

is in the Prelude now!

### Haskell Prime numbers

Haskell Prime is a collection of ideas for future versions of Haskell, including a proposal to generalize the numeric typeclasses by removing their semantic content, replacing `Num`

and most of its subclasses with purely algebraic classes:

```
(+) :: AbelianGroup a ⇒ a → a → a
(*) :: Ring a ⇒ a → a → a
(/) :: DivisionRing a ⇒ a → a → a
mod :: EuclideanDomain a ⇒ a → a → a
```

(A division ring is like a field except that multiplication is not necessarily commutative.)

This makes the numeric operations maximally general, at the cost of making them meaningless. It also gives mundane code types (and type errors) that make sense to mathematicians and no one else:

```
factorial :: (Ring a, Ord a) ⇒ a → a
sum :: AbelianGroup a ⇒ [a] → a
```

(I'm not sure why `+`

is on `AbelianGroup`

instead of something more general like `Magma`

. Maybe it's to comply with users' expectation that `+`

be associative and commutative.)

This proposal brings the straightforward clarity of `Monad`

to arithmetic, an area where Haskell has long suffered from comprehensibility bordering on practicality.

I'm not sure algebraic abstractions are always a bad idea for programming languages, but the difficulty of `Monad`

suggests they're hazardous.

A semantic abstraction tells you what its instances mean. An algebraic abstraction only tells you what transformations preserve that meaning. That's enough for optimization, but not for understanding.

The correct term for communicating purely algebraic abstractions like monads to software people (rather than maths people) isn't abstraction; it's design pattern. That gets people into the right frame of mind, away from specific applications, and out into meta talk about the shape of a design. The fact that design patterns have cropped up in two different guises in different approaches to computation suggests that there's something there, and that it's probably worth formalizing.

ReplyDeleteAre you sticking with such low level concepts as division rings to pander to the hacking instincts of programmers? Obviously a properly grounded mathematical framing in terms of homology theory, topos, and Grothendeik primes would be far more rigorous. If we treat e.g. Haskell maps as frictionless points in co-functor generalizations of Jordon-Holder, we could put the whole enterprise of programming on a solid footing.

ReplyDelete