The first problem is that the docs are frustratingly introductory. Everything I can find is aimed at explaining the language to beginners, with lots of repetition and very little precision. Would it be so hard to explain the language kernel in a way someone who knows languages can understand? The closest I can find to a description of the semantics is the overview of evaluation and one entry in the FAQ. There are grandiose statements about how different Rebol is from other languages, and how it's supposed to take over the world, but not much explanation.
Okay, even from the beginner documentation you can learn something. Rebol is intended for making little DSLs - so, like Lisp, it separates the syntax from the language. Programs are defined in terms of trees, not text, and the trees are easily available to interpreters. But Rebol also tries to preserve a familiar appearance of programs - infix, and without too many parentheses. The usual way to do this is to keep the trees as abstract as practical, and have the parser transform complex syntax into a simple underlying representation. Rebol does the opposite. It preserves most of the syntax in its parse trees, so the parser is simple but the interpreter is complicated. In Rebol,
2 * sin :x reads as a four-element sequence:
[2 * sin :x]. The advantage is that code closely resembles its source text; the disadvantage is that even parsed code is rather complicated and awkward to operate on.
The parser is not actually all that simple, because Rebol has syntax for a great many data types. In addition to the four kinds of symbols and three kinds of sequences used for programs, there are a variety of other types with syntax. The complete list is:
- Word (symbol; 4 kinds):
foo: bar :baz 'error-out-of-metasyntactic-variables
- Block (vector):
[1 two [three] "IV"]
- Paren (same as Block, but different eval semantics):
(alpha [beta * gamma] delta)
- Refinement (generalized field accessor):
- Path (field access expression):
- "Decimal" (actually binary floating point; there's nothing decimal about them):
10.0 2,718'281'828'4 6.023E23
- File (pathname):
#123-456-789(how is this usefully different from String?)
2008-07-26 26-Jul-2008 7/26/2008
- Tuple (of 3 or more small integers):
- Pair (2-D point):
There are only a few types that don't have syntax:
- Logic (boolean):
true false yes no on off(these are constants, not syntax)
- None (nil):
- Hash (hashtable)
- Image (bitmap)
Curiously, much of the syntax is defined by special kinds of tokens rather than with special characters. It's not just numbers, but dates, times, and even pairs. This makes it a bit hard to tell how a token will be parsed, and thus whether an unusual symbol needs to be escaped. It seems to me that Rebol goes a bit overboard on this. I'm not sure why dates or email addresses need to be handled by the parser rather than by the few operations that use them. The syntactic support is slightly convenient for some small programs, but I have a hard time believing it helps with larger ones.
Of course a language's semantics are more important than its syntax, and it's tricky to figure them out from a tutorial. The obvious alternative is to read the source - but here's the other reason I stopped looking at Rebol: it's not open-source. “REBOL/Core is free and will always be free”, says the license page, but they only mean gratis. Quite frustrating to someone who might understand it by looking at
eval, or to anyone considering using the language. Joe Marshall (who used to work on Rebol) apparently wrote a Rebol-to-Scheme compiler, but the actual code seems to have vanished from the face of the Internet. (Update: Here's a copy. Thanks, anonymous!) Fortunately, bits of a Rebol interpreter survive in Joe's talk on the difficulties of doing tail-call optimization. From these, and from the documentation, I think I understand Rebol's
eval now, though not with enough confidence to explain it formally - go read that talk if you want to know.
The odd thing is that it operates on sequences, so it has to parse as it evaluates. The two operations which consume arguments (assignments and function calls) evaluate as many as they need from the rest of the sequence, not knowing where an expression ends until they've evaluated it. (This is why tail-call optimization is hard: you can't tell whether you're in a tail context until the last moment.) After evaluating, if there's nothing left, then you just did a tail call, and its result is the value of the block. If there is something left, and it begins with an infix operator, then what you just evaluated was its first argument, so you evaluate another argument and call the infix operator. (This means all infix operators are left-associative.) This is what it takes to interpret unparenthesised prefix with variable arity, but it's certainly awkward.
Here's something disturbing: Rebol 1.0 had lexical scope, but Rebol 2.0 doesn't. Blocks are just self-evaluating, and it's up to functions (or should I say FEXPRs?) like
if to evaluate them in some environment. (However, this article suggests words know something about their environment.) I don't know how assignment handles nested scopes. It is worrying, especially in a poorly specified language with no reference implementation, to see what appears to be a well-known mistake, and the possibility of several others. I don't trust my intuition when evaluating a language, but I'm beginning to get the impression that Rebol has a lot of cute features on a rather shaky foundation.
Trying to reconstruct a language from introductory documentation is a frustrating task, so I'll stop here. Unfortunately I've exhausted my patience before learning much about its library or how it works in practice, so I still don't know what the good details were that Edoc was talking about. Maybe some other time.