Better Rebol info

My Google-fu is weak. Rebol's standard documentation may not be good, but there are better alternatives, and I didn't find them. Fortunately commenters supplied links and hints on what to search for, leading to at least:

  • A better tutorial that shows off Rebol's GUI library. How many languages can do one-liners with GUIs? How many language designers even consider the possibility, despite its obvious value?
  • Some articles on Rebol's semantics, including a metacircular interpreter. They're not entirely clear but they may be the best available.
  • ...and even a surviving copy of Joe Marshall's Rebol-to-Scheme compiler. (But that's for Rebol 1.0; the semantics have changed since.)

If I'm reading this right, Rebol's bind (which is implicit in evaluation) makes closures, but it doesn't store the environment once per function - it stores it once per word! That presumably makes func (λ) more expensive, but allows lexical scoping of code that's passed around without being compiled first. Which is very common in Rebol - one of the main ways to extend the language is to define interpreters that operate on blocks. Because blocks are self-evaluating, there's no syntactic overhead to this - it's just like FEXPRs. Rebol appears to have actual FEXPRs too, via unevaluated arguments, but they're not needed much.

Next (when I get around to it): investigating the libraries.

4 comments:

  1. The binding model I used for Rebol 1.0 was basically a standard implementation of lexical binding. A lexical environment was carried around by the interpreter and when values were needed they were looked up. The only twist was that Carl wanted symbol instances to have lexical scope. So suppose you had this rebol function:

    func [a] [ return 'a ]

    (OR whatever the syntax was, I forget). The symbol that is returned can be evaluated in any context and it will return the value that was passed in to func. The trick was to close over the symbol. I had to tweak some of the other symbol routines to deal with this idea of closed over symbols, but it worked. (I don't think it was a good idea, but it was easy to implement.)

    Carl never seemed to `get' how lexical environments work. The shape of the environment is constant from invocation to invocation, and this is how you get lexical addressing, but the instance of the environment changes.

    When Carl ditched my code he went back to his original plan of allocating a binding cell for each lexical mention of the symbol in the source text. Imagine your source is like a Christmas calendar with the chocolate goodies behind each date. Behind each binding location is a box where the value is stored. This model has bizarre semantics. The first (and most obvious) problem is that it isn't re-entrant. If you recursively call a function, you'll end up smashing the values that are there for the outer invocation. The first release of Rebol 2.0 had this bug.

    Carl patched this up with the following hack. If a cell is unbound, then binding it causes it to be assigned a value. If the cell already has a value, that value is saved away on the stack while the function is called, and then restored to the cell when the function returns. That fixes the re-entrancy problem, but you still have other issues.

    The mechanism of saving the old binding away on the stack is plain-old bog-simple shallow dynamic scoping. But because Carl has a value cell for each binding site, rather than a single global cell, you only see the dynamic binding effect within a single function at a time. This makes it less likely that you'll suffer from inadvertant variable capture, but it doesn't eliminate it completely.

    Carl's hack of leaving the previous binding in place if the cell was unbound before gives you a strange effect that variables can be used for some time after they are bound. Unfortunately, if a call to the routine that bound them occurs, the value gets smashed. This is `indefinite extent' in the truest sense --- you simply can't tell how long the variable will retain its value.

    So Carl's binding methodology is a weird cross behind static binding, where each formal parameter has its own value cell, and shallow dynamic binding where the value is saved on the stack when a function is re-entered.

    Carl's implementation has one other weird feature. If you copy a block of code, you also end up copying the value cells that are associated with the bindings in the code. Some rather enterprising Rebolers have used this trick to implement a `poor-man's lexical binding' by unsharing the dynamically bound value cells before leaving a function and thus getting the value to persist.

    ReplyDelete
  2. Ooh, details! Thanks!

    Was there a use in mind for the lexically scoped symbols?

    ReplyDelete
  3. Well, not a particular use per se, but Carl wanted it to be the case that if you forced the evaluation of a symbol or block that you'd get the lexically scoped value. So you could return a block as a list of tokens and then call `do' on it and run it as if it were a delayed value. That's an interesting idea, but extending it to symbols themselves is probably going too far.

    ReplyDelete
  4. Oh well, I guess we'd need a blog post from Carl, but let me give my two cents here.

    First, Joe, the problem is that your binding model in 1.0 was function based, while REBOL is independent from functions. That may sound like an heresy to you and others, but this is one of the big difference between REBOL and, say, scheme. (It is also to be said, for fairness, that your implementation was 10 times slower than Carl's implementation. I agree that he had to use a few tricks to get there, like the one you mention of saving a function's context values on the stack, but they were worth it.) Indefinite extent was not lost because of the stack trick: yes, functions don't provide that anymore, but 99% of functions don't need that. In the case you need, you dynamically create a new context with USE or MAKE OBJECT!. The following:

    use [a] ['a]

    has basically the same effects that your above example in 1.0 had, and the context is left around for as long as it is referenced. The big difference is that there is no hidden lexical environment that has to be passed around by the interpreter. Everything is first class (and contexts and functions are separate, though each function has a context, that is reused for performance reasons, using the stack trick you mention). Having used both your 1.0 implementation and Carl's 2.0 implementation for what's like 8 years now, I can say with confidence that Carl was right here.

    REBOL 3.0 makes things even more clear here. There are two kinds of functions: function! and closure!. function! uses the stack for arguments, and words are bound relative to the stack. They cannot be referenced outside of the function invocation (they can be by a another function that is called by it, but not after it returns). They are fast, binding happens only on function creation, and they are what most users will use most of the times.

    closure! instead creates a new context at each invocation. This means that the function body has to be copied and bound to that new context at each invocation. This is slow, but it gives you the equivalent of your above example, faster than your original implementation (closure! is definitely not 10 times slower than function! - it just has a bigger call overhead).

    I don't think it's really Carl that never got how lexical environments work. Instead, you never seemed to get that REBOL was not about lexical environments but about representing all of code (including binding) with first class values.

    This old thread may give some more light on how "scope" works in REBOL. I don't think it can be called "scope" in the traditional sense, but I'm sure Joe can tell us how it's called precisely in the literature.

    ReplyDelete

It's OK to comment on old posts.