Defending against nonexistent problems

When do good programmers write bad code? There are a great many ways to do that, of course, and most of them aren't very illuminating. But there are some common syndromes that account for a lot of poor code. One is when the programmers are writing in a language they don't know very well. They can't use its less guessable features, and since they aren't sure what its pitfalls are, they guess, and end up defending against ones it doesn't actually have. For example, I often see C++ code like this:

if (somepointer)
    delete somepointer;

This is unnecessarily explicit - delete is defined to do nothing if its argument is null, precisely so you don't have to check. (And every implementation I've ever used has done this correctly.) It should be just:

delete somepointer;

But someone new to C++ won't know that unless they've learned it explicitly. Until then, a reasonable programmer may continue to defend against the obvious mistake of deleting null.

Programmers who switch between related languages are prone to infelicities like this:

FILE *f = NULL;
f = fopen(filename, "r");

This is quite reasonable in C, because there might be (or might previously have been) something between these two lines - and in most dialects of C, declarations have to precede statements, so you can't always combine the two. C++, like recent C dialects, lacks this annoying restriction, but a C programmer won't necessarily know that.

They should know better than to read lines from a file like this, though:

char line[MAXLINELENGTH];
while (!feof(somefile)) {
  memset(line, 0, MAXLINELENGTH);
  if (fgets(line, MAXLINELENGTH, somefile)) {
    //do something with line...
  }
}

instead of the shorter equivalent:

char line[MAXLINELENGTH];
while (fgets(line, MAXLINELENGTH, somefile)) {
  /* do something with line... */
}

It's easy to understand why someone might write the longer version. The C standard library is full of functions which are difficult or impossible to use safely (gets, for instance), or which require unnecessarily tedious checks when calling them (malloc, free). Someone familiar with C, but not with the details of fgets, might reasonably doubt it would work correctly without the explicit feof and memset. (And once you've learned to fear uninitialized buffers, it may be easier to memset everything than to think about whether it's necessary.) But fgets is actually well designed. It always null-terminates the buffer, and its return value is well-behaved at EOF. Neither of these defensive additions is necessary.

In fact, the shorter version of that loop actually has slightly better error behavior: if some error (loss of network connectivity?) prevents fgets from reading the file, but doesn't cause eof, the longer code might loop forever, while the shorter one will stop. So in this case the defenses have made things worse, although it's unlikely to matter. I suppose the truly paranoid answer would be to call ferror too...

I'm not complaining that the authors don't know the language they're writing in. It would be nice if they did, but in practice it's often necessary for people to work in languages they don't know very well, and in that situation they'll inevitably work around limitations in their knowledge the same way they work around limitations in the language. This will usually be more efficient than scouring the documentation for shortcuts that might exist. Most of the time their code will still work, and if they keep using the language, they'll learn the details soon.

But sometimes they're unwilling to learn. I've seen all three of these bugs in C++ code written by experienced C++ programmers - people who should really know the commonly used parts of the language by now. Most programmers will simply learn these features when they're pointed out, but some don't want to. They find excuses (sometimes rather contrived ones) to discount the new information and keep programming the way they were. I don't understand what's going on here. Are they accustomed to feeling confident in their skills, and don't want to disrupt the situation by venturing into unfamiliar territory? But how does someone become a programmer without being generally curious about the subject?

No comments:

Post a Comment

It's OK to comment on old posts.