Much of this post seems familiar to me, as if I've seen it somewhere else, perhaps on LL1-discuss or comp.lang.*. But I can't find the post I remember, so maybe I'm imagining someone else saying what I'm thinking.
Macros are flexible, and unfamiliar to most programmers, so they inspire a lot of confusion (more, in my opinion, than they deserve, but that's a topic for another day). Sometimes people try to make sense of this confusion by classifying them into a few categories. These classifications typically include:
- Macros that evaluate some arguments lazily, like `if` and `and`, or repeatedly, like `while`.
- Macros that pass some arguments by reference rather than by value, like the `setf` family.
- Binding macros that simply save a lambda: `with-open-file`. In languages with very terse lambda (like Smalltalk) these are not very useful, but in languages that require something like `(lambda (x) ...)`, they're useful and common.
- Macros that quote some arguments (i.e. treat them as data, not expressions).
- Defining macros like `defstruct`.
- Unhygienic binding macros: `op`, `aif`.
The reasons for the classifications vary. Sometimes the point is that all of the categories are either trivial or controversial. (The people making this argument usually say the trivial ones should be expressed functionally, and the controversial ones should not be expressed at all.) Sometimes, as in this case, the point is that some of the categories are hard to express in any other way. Sometimes the point is that some categories are common enough that they should be built in to the language (e.g. laziness) or supported in some other way (e.g. terse lambda) rather than requiring macros.
These classifications aren't wrong, but they are misleading, because the most valuable macros don't fit any of these categories. Instead they do what any good abstraction does: they hide irrelevant details. Here are some of my favourites.
Lazy cons
If you want to use lazy streams in an eager language, you can build them out of `delay` and eager lists. But this is easy to get wrong. Do you cons an item onto a stream with `(delay (cons a b))`? `(cons (delay a) (delay b))`? `(delay (cons (delay a) b))`? Something else?
This is hard enough that there's a paper about which one is best and why. Even if you know (and regardless of whether you disagree with that paper), it's easy to make mistakes when writing the `delay`s by hand. But the exact place where laziness is introduced is an implementation detail; code producing streams doesn't usually care about it. A `lazy-cons` macro can hide that detail, so you can use lazy streams without worrying about how they work. That's what any good abstraction should do.
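To see why the placement matters, here is a minimal sketch in Python, an eager language without macros. All the names (`Delay`, `lazy_cons`, `take`, `integers_from`) are mine, and this shows just one workable placement of the delay (eager head, delayed tail); the point is that every caller has to remember this placement by hand, which is exactly the detail a `lazy-cons` macro would hide.

```python
class Delay:
    """A memoized thunk: the classic delay/force from Scheme."""
    def __init__(self, thunk):
        self.thunk = thunk
        self.forced = False
        self.value = None

    def force(self):
        if not self.forced:
            self.value = self.thunk()
            self.forced = True
            self.thunk = None  # let the closure be collected
        return self.value

def lazy_cons(head, tail_thunk):
    # Eager head, delayed tail. Without a macro, every caller must
    # remember to wrap the tail expression in a thunk themselves.
    return (head, Delay(tail_thunk))

def take(n, stream):
    """Force and collect the first n elements of a lazy stream."""
    out = []
    while n > 0 and stream is not None:
        head, tail = stream
        out.append(head)
        stream = tail.force()
        n -= 1
    return out

def integers_from(n):
    # The recursive call hides inside a lambda, so the stream is
    # infinite but only the demanded prefix is ever built.
    return lazy_cons(n, lambda: integers_from(n + 1))

print(take(5, integers_from(0)))  # → [0, 1, 2, 3, 4]
```

A macro version would let callers write the tail as a plain expression and insert the thunk itself, so the delay placement could never be gotten wrong at a call site.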
Sequencing actions
Haskell's `do` is not, officially, a macro, but this is only because standard Haskell doesn't have macros; in any case `do` is defined and implemented by macroexpansion. Its purpose is to allow stateful code to be written sequentially, in imperative style. Its expansion is a hideous chain of nested `>>=` and lambdas, which no one wants to write by hand (or read). Without this macro, IO actions would be much more awkward to use. Some of the convenience could be recovered through functions like `sequence`, but using actions to write in imperative style would be impractical. `do` hides the irrelevant functional plumbing and relieves the pain of something necessary but very un-Haskell-like. Really, would you want to use Haskell without it?
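The shape of that plumbing can be imitated in any language with first-class functions. Here is a hypothetical Python sketch (the names `bind` and `safe_div` are mine) of the nested-`>>=`-and-lambda chain that `do` would generate, using a Maybe-like convention where `None` means failure:

```python
def bind(m, f):
    """A Maybe-style >>=: short-circuit on failure (None)."""
    return None if m is None else f(m)

def safe_div(a, b):
    """Division that fails (returns None) instead of raising."""
    return None if b == 0 else a / b

# Roughly what Haskell's do-notation would let you write as
#   do x <- safe_div 10 2
#      y <- safe_div x 5
#      return (x + y)
# but spelled as the nested chain you'd otherwise write by hand:
result = bind(safe_div(10, 2),
              lambda x: bind(safe_div(x, 5),
                             lambda y: x + y))

print(result)  # → 6.0

# Any failure anywhere in the chain propagates automatically:
failed = bind(safe_div(10, 0),
              lambda x: bind(safe_div(x, 5),
                             lambda y: x + y))
print(failed)  # → None
```

Two steps of nesting are already unpleasant; a realistic action sequence has many more, which is why no one wants to write the expansion by hand.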
List comprehensions
Haskell's list comprehensions, like its `do`, express something that could be done with functions, but less readably. List comprehensions combine the functionality of `map`, `mapcat`, and `filter` in a binding construct that looks a lot like set comprehensions. They save having to mention those list functions or write any lambdas.
I sometimes wish there were a way to get a `fold` in there too, but it's a good macro as it is.
Haskell list comprehensions wear a pretty syntactic skin over their macro structure, but this is not essential. Clojure's `for` demonstrates that a bare macro works just as well.
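Python's comprehensions make the same trade concrete. The sketch below writes one nested, filtered comprehension, then spells out the equivalent using the underlying list functions (`map`, `filter`, and flattening via `itertools.chain`, Python's stand-in for `mapcat`):

```python
from itertools import chain

# One binding construct: nesting, filtering, no lambdas.
pairs = [(x, y) for x in range(3) for y in range(3) if x < y]

# The same thing spelled with the list functions it replaces:
# map the inner generator, filter it, then flatten (mapcat).
pairs2 = list(chain.from_iterable(
    map(lambda x: list(filter(lambda p: p[0] < p[1],
                              map(lambda y: (x, y), range(3)))),
        range(3))))

print(pairs)   # → [(0, 1), (0, 2), (1, 2)]
assert pairs == pairs2
```

Both produce the same list, but the second version buries the intent under three lambdas and two levels of nesting that the comprehension never asks you to write.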
Partial application
Goo's `op` (and its descendants like Arc's `[... _ ...]` and Clojure's `#(... % ...)`) is an unhygienic binding macro that abbreviates partial application and other simple lambdas by making the argument list implicit. It hides the irrelevant detail of naming arguments, which makes it much terser than `lambda` and makes higher-order functions easier to use.
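Python shows the same spectrum without macros: an explicit lambda at one end, `functools.partial` hiding the argument name at the other. A small sketch (the variable names are mine):

```python
from functools import partial
from operator import mul

nums = [1, 2, 3, 4]

# Fully explicit lambda: the argument name 'x' is pure plumbing.
doubled_a = list(map(lambda x: 2 * x, nums))

# partial makes the remaining argument implicit, much as
# Goo's op or Clojure's #(* 2 %) does with its macro.
doubled_b = list(map(partial(mul, 2), nums))

assert doubled_a == doubled_b  # both are [2, 4, 6, 8]
```

`partial` only covers positional prefixes, though; the macro versions can drop the argument into any position in any expression, which is what makes them so much more broadly useful.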
Language embedding
There is a class of macros that embed other languages, with semantics different from the host's. The `composition` macro from my earlier posts is one such. A `lazily` macro that embeds a language with implicit laziness is another. The embedded languages can be very different from the host: macros for defining parsers, for example, often look nothing like the host language. Instead of function calls, their important forms are concatenation, alternatives, and repetition. Macros for embedding Prolog look like the host language but have very different semantics, which would be awkward to express otherwise.
Like `do`, these macros replace ugly, repetitive code (typically with a lot of explicit lambdas) with something simpler and much closer to pseudocode.
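A tiny parser-combinator sketch in Python (all names hypothetical) shows the shape such an embedded language takes when built from plain functions: concatenation, alternatives, and repetition as the primitive forms, each parser a function from a position to a result or a failure:

```python
# Each parser: (text, pos) -> (value, new_pos), or None on failure.

def lit(s):
    """Match a literal string."""
    def p(text, pos):
        if text.startswith(s, pos):
            return s, pos + len(s)
        return None
    return p

def seq(*parsers):
    """Concatenation: run each parser in order."""
    def p(text, pos):
        out = []
        for q in parsers:
            r = q(text, pos)
            if r is None:
                return None
            val, pos = r
            out.append(val)
        return out, pos
    return p

def alt(*parsers):
    """Alternatives: first parser that succeeds wins."""
    def p(text, pos):
        for q in parsers:
            r = q(text, pos)
            if r is not None:
                return r
        return None
    return p

def many(parser):
    """Repetition: match zero or more times."""
    def p(text, pos):
        out = []
        while True:
            r = parser(text, pos)
            if r is None:
                return out, pos
            val, pos = r
            out.append(val)
    return p

# A grammar that reads almost like its BNF: ("a" | "b")* "c"
abc = seq(many(alt(lit("a"), lit("b"))), lit("c"))

print(abc("abac", 0))  # → ([['a', 'b', 'a'], 'c'], 4)
```

Even without macros the combinators already read more like a grammar than like host-language code; a macro layer would go further and remove the remaining function-call noise entirely.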
The usual tricks
Most macros do fall into the simple categories: binding, laziness and other calling conventions, quotation, defining, etc. It's easy to think, of each of these uses, that it ought to be built into the language so you don't have to “fake” it using macros.
Fake? There's nothing wrong with using a language's expressive power to supply features it doesn't have! That's what abstraction is for!
The C preprocessor is a very useful thing, but of course it has given macros a bad name. I suspect this colors the thinking even of people who do know real (i.e. tree) macros, leading them to prefer a “proper” built-in feature to its macro implementation.
From my point of view, a macro is much better than a built-in feature. A language feature complicates the language's kernel, making it harder to implement and, in particular, harder to analyze. Macros cover all of these uses, plus others the designers haven't thought of, in a single feature; and they don't even complicate analysis, because they disappear when expanded, so the analysis phase never sees them.
(To be fair, macros do require the language's runtime to be present at compile-time, and create the possibility of phasing bugs. But either interactive compilation or self-hosting requires the former anyway, and the latter only interferes with macros, so at worst it's equivalent to not having them. Neither is remotely as bad as being unable to express things the language designer didn't think of.)
So I see macros not as a weird, overpowered feature but as an abstractive tool nearly as important as functions and classes. Every language that aims for expressive power should have them.