It's bloat all the way down

High-level languages are wonderful to program in, but they do offend my aesthetic sense in one way: their implementations are complicated. All those valuable features - libraries, garbage collection, runtime compilation, continuations and so on - cost complexity and space, even in programs that don't use them. They're worth it, of course. But they are an affront to perfectionism, because they bloat the distributed forms of programs.

It's tempting to believe that low-level languages don't have this problem, that they're an efficient paradise, where there is no overhead but what a program inflicts on itself, where everything is possible, even if nothing is easy. It is a myth, of course. Brian Raiter's tiny ELF executables show that there's bloat even in trivial C and assembly programs, because of how code is packaged. By stripping out most of this bloat, and then abusing ELF, he managed to shrink a trivial C executable by 98%.

This is impressive, but depressing, because it shows that even at this level, there is arbitrary waste. And so it is everywhere - in hardware, in network protocols, in the problems computers are used to solve. No level of abstraction is a bloatless Utopia, but fortunately we can pretend each one is, because the imperfections of one level don't greatly affect the level above.

(Via Randy Owens, who pointed out that one of Brian's tricks, overlapping data, is also used by some very small viruses.)

1 comment:

  1. I feel the same way. I have the habit of seeing through abstractions. It's how I've learned to understand programming - from an operational viewpoint. Very few language implementations care much about the low-level details. Heck, some of them compile to C, how lazy is that? I wouldn't go as far as deliberately abusing the ELF format - I would settle for a minimal valid file.

    It turns out not to matter much since memory and cycles are plentiful. But it still makes me cringe :-(

    One funny thing I found is that GCC inserts a comment section into every object file. And when you link your program you get the same comment duplicated over and over again from each object file. This is the default behaviour! I always use 'strip -R .comment' to get rid of it.

