It's often useful to distinguish between parts of a language based on whether they can be expressed in user code:
- The kernel is the part the compiler has to know about.
- The library is the part that can be implemented in user code.
- Primitives are the parts that are provided by the implementation, but could be expressed in user code.
This is an obvious and possibly familiar distinction, but it's underused. It should be part of the standard vocabulary for talking about languages.
Some examples in Scheme:
letis library, because it can be implemented as a macro. (It could be implemented as a macro in Common Lisp too, but for historical reasons it's in the kernel.)
procedure?is primitive, because while it could be implemented in user code, there's nothing it could be implemented in terms of. In a language with
type-of, it's library.
- The Form Without A Name (the function call operator) is kernel, because there's no way to define it, or anything like it, in user code. Even in Racket, where
#%appcan be redefined, there is no way for user code to define such a construct.
case-lambdacan express each other as macros. At least one of them needs to be predefined, but the choice is arbitrary. Since the compiler needs to know about that one, it's kernel and the other two are library.
call-with-current-continuationis kernel, not just a primitive, because it can't be implemented without changing how function calls are compiled.
consis a primitive in R5RS, because there's no way to define new datatypes. (Defining pairs as tagged vectors doesn't count, even if you redefine all vector operations to reject them, because existing code will still see them as vectors.) In R6RS, it can be library, because it could be defined with
define-record-type. (However, it's often still primitive, for efficiency or for historical reasons.)
syntax-rulesis mostly library, but requires some primitive support.
define-syntaxis kernel, but in a language with a hook to define new code-transformation passes in user code (kind of like Perl's source filters), it could be library.
- The distinction between kernel and library has nothing to do with what names are standard or in the default namespace.
mapis standard and predefined but is pure library; if
is provided in a separate module, it's still kernel, because it requires compiler support.
Garbage collection is kernel, because it requires compiler (and runtime) support.
Some things to say using these terms:
- Library has to be implemented in terms of something, so primitives are necessary. There are often additional primitives for efficiency (especially in slow implementations), but most primitives add functionality though would be otherwise unavailable.
- When two constructs can express each other, either one could be primitive; the choice is often arbitrary.
- A foreign function interface can be seen as an endless source of primitives. This is one of the reasons FFIs are so important for new languages: they compensate for a shortage of primitives.
- Library, unlike the other two parts, is potentially portable between implementations. If a language provides a good reference implementation of its library (as most SRFIs do), this makes it much easier to implement.
- Complexity in the kernel is much more expensive than in primitives or library, for both implementors (because it complicates the compiler) and users (because they can't understand the kernel according to the same rules as user code). Primitives, however, aren't necessarily more expensive than library.
Expressive features are particularly interesting when they allow the language to express things in library that would otherwise have to be in the kernel. Such features can often pay for their complexity by simplifying the kernel. For example:
- In most languages, all abstract syntax (and concrete syntax too, for that matter) is kernel. In languages with macros, most abstract syntax can be library, leaving only a few special forms in the kernel. In a fexpr language, even the special forms are mere primitives, because the compiler doesn't have to know anything about them. (It might still know, for efficiency, but it doesn't have to.) Kernel has a very small kernel.
- Exposing compiler data structures such as environments allows user code to do things otherwise possible only in the kernel. In Scheme, there is no programmatic way to create definitions, so
define-valuesare kernel. In Common Lisp, or in a Scheme extended with a
define!procedure that creates top-level definitions, top-level
define-valuescan be implemented as macros.
- Optimization is usually kernel, but Common Lisp and GHC can express some optimizations in user code. (This is a powerful and underappreciated capability.)
- ML's typechecking is kernel, because there's no way for user code to define it. In a language which allows user-defined program analyses (are there any?), it could be library. (This feature would fit nicely with user-defined optimizations.)
There are really two orthogonal distinctions here: kernel vs. library is about whether the compiler needs to know about the feature; primitive vs. derived is about how it was implemented. However, derived kernel is usually impossible (what, are you redefining
eval?), and the distinction from primitive kernel is not very interesting, so I don't bother.
The distinction between kernel and library is often more important than that between library and primitives. In these cases it's convenient to lump primitives and library together.
Some features require runtime but not compiler support, like fixnums or weak pointers or unboxed numeric arrays. These form a class (“runtime features”?) between primitives and kernel. I didn't include this class in the list above because I think it's less important, and can usually be lumped in with kernel or primitives.
I'm inconsistent about whether “primitive” refers only to features that must be built in, or to all those that are built in to a particular implementation.