Style identifies authors

Yossi Kreinin points out a rarely-mentioned use of stylistic variation: personal style is a signature that helps identify who wrote a piece of code.

I find it easier to understand programmers' intents through their unique style. When they're all forced to write superficially similarly, I can't tell who wrote what, and what the subtext of the code is.

I'll illustrate the last point with a couple of examples. I knew O.M. before I ever saw him and before I even knew his name. To me, he was the programmer with the two spaces before the trailing const:

inline int x()  const;

I knew him through his code: mathematically elegant, obsessive about fine details of type-based binding and modeling. I could guess what he left out with an intent to maybe add it later. I understood him.

Likewise, I can always spot G.D.'s code by the right-leaning asterisk:

int *p,*q=arr+i;

G.D. certainly couldn't care less about types - similarly to most people with this asterisk alignment. I know his code: terse, efficient, to the point. I know what to expect.

I'd never thought about it, but I do this too. I learn individual styles, and use the identity of the authors to help me understand their code.

One of the main ways I use this is to adjust my credence in mistakes. If I see seemingly unnecessary infrastructure, like an interface with a single implementation, in code written by an overzealous practitioner of OO, then it probably really is unnecessary, so I can safely ignore it; if it was written by a better architect, then I wonder what purpose they had in mind and whether it's still needed. If I see a series of seemingly redundant fcloses, it helps to know that the author was someone whose sloppiness drove them to paranoia, because then I won't waste time looking for a good reason. If I see duplicated code from someone who's averse to creating abstractions, I can assume the duplication is unnecessary instead of poring over it to see what's different. But if I see complex, tangled code from a pedantic minimalist, I know I need to find out why. Knowing the author lets me prune unlikely lines of investigation to focus on the important questions.

(I was going to include concurrency issues as another example, but on reflection I don't think I make much use of authorship for this. When dealing with shared state, I'm similarly paranoid regardless of whose code I'm reading, because it's easy for anyone to get wrong. Knowing the author only helps with mistakes some authors wouldn't make.)

It also helps with interpreting comments. Knowing the author's preferred terminology and abbreviations makes telegraphic comments and names less mysterious: is sz short for “size” or Hungarian notation for “string”? Knowing how they think, what they know and what they consider worth mentioning also helps, as it does in interpreting any communication.
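To make the sz ambiguity concrete, here is a minimal sketch of the two readings. The function and parameter names are hypothetical, invented only for illustration; they are not taken from any real codebase.

```cpp
#include <cstddef>
#include <cstring>

// Reading 1: "sz" abbreviates "size" (here, the size of one element in bytes).
// Hypothetical helper: total bytes occupied by `count` elements.
std::size_t total_size(std::size_t sz, std::size_t count) {
    return sz * count;
}

// Reading 2: "sz" is the Hungarian prefix for a zero-terminated string.
// Hypothetical helper: length of a C string, excluding the terminator.
std::size_t name_length(const char* szName) {
    return std::strlen(szName);
}
```

An author who writes the first form is unlikely to write the second, so once you know whose code it is, an unexplained sz in a comment is usually easy to decode.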

Identifying authors is particularly easy in C++ or Perl, because those languages offer many stylistic choices with no obvious right answer, so there's lots of room for individual variation. In languages with fewer choices and strong stylistic traditions, like Java, it's harder. Lack of variation is traditionally supposed to be a good thing, on the grounds that variation is noise obscuring the signal of programs. But if variation serves to identify authors, maybe it's not noise after all.

1 comment:

  1. I'm the guy who indents right braces an extra time and still uses hard tabs.
