Clojure is almost as big as Common Lisp

How big is Clojure's standard library? Never mind the Java library, or even clojure.contrib; how much is built in to clojure.jar?

One way to answer this is to count definitions, in the form of public Vars:

(defn ns-size
  "How many public Vars does this namespace have?"
  [ns-sym] ; named ns-sym to avoid shadowing clojure.core/name
  (require ns-sym)
  (count (ns-publics (find-ns ns-sym))))

(def builtins '(clojure.core clojure.core.protocols clojure.data
                clojure.inspector clojure.java.browse clojure.java.browse-ui
                clojure.java.io clojure.java.javadoc clojure.java.shell
                clojure.main clojure.pprint ;clojure.parallel ;deprecated
                clojure.reflect clojure.repl clojure.set clojure.stacktrace
                clojure.string clojure.template clojure.test clojure.test.junit
                clojure.test.tap clojure.walk clojure.xml clojure.zip))

In Clojure 1.3.0a6:

user=> (ns-size 'clojure.core)
563
user=> (reduce + (map ns-size builtins))
826

So Clojure is already approaching Common Lisp's 1064 definitions [1]. This measure probably overstates Clojure's size relative to CL, because it compares all of Clojure to only the standard parts of Common Lisp, even though every CL implementation includes other libraries in addition to the standard. CL also tends to pack more features into each definition, through keyword arguments and complex embedded languages like format and loop, so counting definitions understates its size. On the other hand, many CL features are of dubious value, so Clojure may already have surpassed it in useful library.

If it hasn't, it will soon, because Clojure's library is still growing. The Var count has increased by 50% in the last two years:

Version   Date       clojure.core   clojure.*
1.0.0     May 2009   458            543
1.1.0     Dec 2009   502            576
1.2.0     Aug 2010   546            782
1.3.0a6   Mar 2011   563            826
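
As a rough check on the 50% figure, and on how long the remaining gap to Common Lisp's 1064 definitions might take to close, here is a small Python sketch using the clojure.* column of the table. It assumes growth stays linear between the first and last rows, which is a simplification made for illustration, not something the data establishes. The function names are made up for this example.

```python
# clojure.* Var counts from the table: (version, year, month, count)
RELEASES = [
    ("1.0.0", 2009, 5, 543),
    ("1.1.0", 2009, 12, 576),
    ("1.2.0", 2010, 8, 782),
    ("1.3.0a6", 2011, 3, 826),
]
CL_DEFINITIONS = 1064  # Common Lisp total quoted in the post


def months_between(first, last):
    """Whole months from one release to another (the table gives no days)."""
    return (last[1] - first[1]) * 12 + (last[2] - first[2])


def growth_percent(first, last):
    """Relative increase in Var count, as a percentage of the earlier count."""
    return 100.0 * (last[3] - first[3]) / first[3]


def months_until(target, first, last):
    """Months after `last` until `target` Vars, ASSUMING linear growth."""
    rate = (last[3] - first[3]) / months_between(first, last)  # Vars per month
    return (target - last[3]) / rate


first, last = RELEASES[0], RELEASES[-1]
print(f"growth: {growth_percent(first, last):.0f}% over "
      f"{months_between(first, last)} months")           # growth: 52% over 22 months
print(f"months until {CL_DEFINITIONS}: "
      f"{months_until(CL_DEFINITIONS, first, last):.1f}")  # months until 1064: 18.5
```

Under that assumption the crossover lands about a year and a half after March 2011. Which release that corresponds to depends on the release schedule, which this sketch does not model.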

At this rate, Clojure 1.6 will have a bigger standard library than Common Lisp. (The timing depends on how quickly parts of clojure.contrib get assimilated into clojure.core.) I suppose when that happens, Common Lisp will still be perceived as huge and bloated, and Clojure as relatively small and clean. Any perceived complexity in Clojure will be blamed on Java, even though Java interoperation accounts for only a tiny part of Clojure's library. Scheme will still be considered small and elegant, no matter how big its libraries (or its core semantics) get. (R5RS Scheme has 196 definitions, and R6RS about 170, or 680 with its standard libraries. Racket is much bigger.) Traditional beliefs about language sizes are not very sensitive to data.

Update: riffraff on HN counts Python's definitions: len(__builtins__.__dict__.values()) ⇒ 143 definitions in the default namespace, plus len(set(chain( *[dir(o) for o in __builtins__.__dict__.values()]))) ⇒ 250 method names, so about 400 built-in definitions. There are also over 200 other modules in the standard distribution, so its full standard library is much bigger — the first 24 modules have sum([len(dir(__import__(n))) - 4 for n in "string re struct difflib StringIO cStringIO textwrap codecs unicodedata stringprep fpformat datetime calendar collections heapq bisect array sets sched mutex weakref UserDict UserList UserString".split()]) ⇒ 476 definitions, so the total is something like 5000. “Batteries included” indeed.
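
The one-liners above are Python 2: modules such as StringIO, sets and mutex no longer exist in Python 3, and `__builtins__` is better reached through the `builtins` module. A rough Python 3 counterpart might look like the sketch below. The helper names are invented for this example, the module list keeps only names from the original list that still exist, and it skips underscore-prefixed names instead of subtracting 4. The counts vary between Python versions, so none are shown.

```python
import builtins
import importlib


def builtin_count():
    """Number of names in the default (builtins) namespace."""
    return len(vars(builtins))


def attribute_names():
    """Distinct attribute names across all builtin objects (the dir()-based count)."""
    names = set()
    for obj in vars(builtins).values():
        names.update(dir(obj))
    return names


def module_size(name):
    """Public (non-underscore) names exposed by one standard-library module."""
    module = importlib.import_module(name)
    return len([n for n in dir(module) if not n.startswith("_")])


# Subset of the post's list that still exists in Python 3.
MODULES = ("string re struct difflib textwrap codecs datetime "
           "calendar collections heapq bisect sched weakref").split()

print(builtin_count(), "names in builtins")
print(len(attribute_names()), "distinct attribute names")
print(sum(module_size(m) for m in MODULES), "public names in", len(MODULES), "modules")
```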

However, standard library size is less important than it used to be. When every language has an automatic library fetcher like Leiningen or Quicklisp or PLaneT, built-in libraries aren't much more readily available than other well-known libraries. The main obstacle to library use is no longer finding suitable libraries, or downloading them, but learning to use them.

[1] Common Lisp has 978 symbols, but symbols ≠ definitions: some symbols have definitions in more than one namespace, and some have no definition. Common Lisp has 752 definitions in the function namespace (636 functions, 91 macros, and 25 special forms), 116 variables, 85 classes, 66 setf functions, and 45(?) type specifiers, for a total of 1064 definitions. There are about 30 symbols without definitions: declare and its declarations and optimization qualities, a few locally bound symbols like call-next-method, and eight lambda-list keywords. There's also plenty of room for disagreement about what counts as a definition.
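
The tallies in this footnote can be checked mechanically. The short Python sketch below just adds up the figures as quoted; the dictionary names are invented for the example, and the type-specifier count is the uncertain one flagged with "(?)".

```python
# Counts copied from the footnote.
FUNCTION_NAMESPACE = {"functions": 636, "macros": 91, "special forms": 25}
OTHER_NAMESPACES = {
    "variables": 116,
    "classes": 85,
    "setf functions": 66,
    "type specifiers": 45,  # marked "(?)" in the footnote
}

function_total = sum(FUNCTION_NAMESPACE.values())
total = function_total + sum(OTHER_NAMESPACES.values())

print(function_total)  # 752
print(total)           # 1064
```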

3 comments:

  1. You've got a minor HTML error: your first block of Clojure code looks like <pre><code>...<code></pre> so you're not closing the <code> and almost the whole page ends up monospaced.

  2. I think that one feature which helps Clojure "feel small" is namespaces. Since you can keep your working namespaces limited and clean, and you don't need to include any more than you need, your "standard library" is only as big as it has to be. You are free to omit nearly everything and (re)define names which exist in other parts of Clojure (even clojure.core).

  3. Gareth: Fixed, thanks.

    J: Most of Clojure's library is in clojure.core, and it's unusual to shadow that, so I don't think Clojure is small in this way, while e.g. Python and Perl are. But I suspect much of the perception of smallness is really about the size of the mental model required to use the library, so more regular languages like Python and Clojure feel smaller than CL or Perl.
