Stylistic workarounds

Some C programmers carefully include a comma after the last element of an array initializer:

int tau_digits[] = { 6,
  2, 8, 3,
  1, 8, 5,
};

Similarly, some Lispers put an extra newline at the end of any list that might grow:

(defun execute-instruction (ins vm)
  (declare (type instruction ins) (type vm vm) (optimize speed))
  (case (opcode ins)
    (nop)
    (constant (write-register vm (destination-register ins) (literal ins)))
    (add (write-register vm (destination-register ins) (+ (read-register vm (source-register-1 ins)) (read-register vm (source-register-2 ins)))))
  ))

Some C++ programmers write member initializers like this:

Class::Class()
: field1(arg)
, field2(arg)
, field3(arg)
{ /* ... */ }

The rationale, in all three cases, is to make diffs clearer. A very common change to such lists is to add new elements, usually at the end. If they're written in the usual way, the diff will show the previous item as changed (a comma added, or closing parentheses moved to the new last line), even though there's no substantive change to that line:

 int tau_digits[] = { 6,
   2, 8, 3,
-  1, 8, 5
+  1, 8, 5,
+  3, 0, 7
 };

If that line is long, it may not be obvious that it hasn't changed:

 (defun execute-instruction (ins vm)
   (declare (type instruction ins) (type vm vm) (optimize speed))
   (case (opcode ins)
     (nop)
     (constant (write-register vm (destination-register ins) (literal ins)))
-    (add (write-register vm (destination-register ins) (+ (read-register vm (source-register-1 ins)) (read-register vm (source-register-2 ins)))))))
+    (add (write-register vm (destination-register ins) (+ (read-register vm (source-register-1 ins)) (read-register vm (source-register-2 ins)))))
+    (subtract (write-register vm (destination-register ins) (- (read-register vm (source-register-1 ins)) (read-register vm (source-register-2 ins)))))))

This is a weakness of our diff tools: they operate on lines, not characters or tokens or trees, so they overstate the scope of the change. Since it's not convenient to fix the tools, we change our programs to avoid the problem. So far this is all reasonable.

But when such a workaround becomes a stylistic prescription, it's easy to forget where it came from, and how narrow the conditions that justified it were. If our diff tools improve (which is not unlikely, since so many people have this itch) and the workaround is no longer necessary, how long will we keep doing it? In twenty years, will programmers still be taught to format lists (of whatever sort) specially, even though it no longer serves any practical purpose?
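To make the line-versus-token distinction concrete, here is a minimal Python sketch (not the algorithm of any existing diff tool) that compares the two versions of the tau_digits example twice: once by line and once by token, both using the standard difflib module. The tokens and token_changes helpers and the regular expression are invented for this illustration, and the tokenizer is deliberately crude: it knows nothing about strings or comments.

```python
import difflib
import re


def tokens(text):
    # Hypothetical tokenizer for this sketch: runs of letters, digits, "_"
    # and "-" become one token, every other non-space character is its own
    # token, and whitespace (including newlines) is dropped.
    return re.findall(r"[A-Za-z0-9_-]+|[^\sA-Za-z0-9_-]", text)


def token_changes(old, new):
    # Return (tag, old_tokens, new_tokens) for each region that differs.
    a, b = tokens(old), tokens(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# The "usual" formatting, with no trailing comma.
old = "int tau_digits[] = { 6,\n  2, 8, 3,\n  1, 8, 5\n};\n"
new = "int tau_digits[] = { 6,\n  2, 8, 3,\n  1, 8, 5,\n  3, 0, 7\n};\n"

# Line-based comparison: the "1, 8, 5" line shows up as removed and re-added.
line_diff = list(
    difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
)

# Token-based comparison: only the newly added tokens are reported.
changes = token_changes(old, new)

print("\n".join(line_diff))
print(changes)
```

On this input the line diff marks the `1, 8, 5` line as changed, while the token diff reports a single insertion of six tokens (the separating comma and the three new digits with their commas), which is the change the programmer actually made.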

What other common stylistic prescriptions are workarounds for bugs in tools?

3 comments:

  1. I don't think about the trailing comma in C as having to do with diffing. It's just easier to always have it there, so you don't end up adding more data on the next line and forgetting to insert the separating comma. It's the same principle as always wrapping if-else bodies in braces even if there's just one statement in them.

  2. How about the use of double space after a period, which I believe is used by some tools like Emacs to recognize the end of a sentence?

  3. Plain console/terminal diff is by no means the only kind of diff tool that's available.

    Graphical diff tools like Diffuse[1] and kdiff3[2] can clearly show you precisely what part of each line changed, highlighting just the changed parts.

    [1] - http://diffuse.sourceforge.net/

    [2] - http://kdiff3.sourceforge.net/

