Trivial program checkers

Typecheckers get (and deserve) a lot of attention for their ability to find bugs, but their success leads people to think typechecking is the only way to check programs. It's not. There are useful program checkers much simpler than any typechecker. Here's an example:

grep scanf

This finds real bugs in real programs — and not just ordinary bugs, but security holes due to %s overflowing buffers.

Here's another checker:

grep 'printf[^"]*$'

This finds printfs that don't have a literal string on the same line, which usually means someone forgot the format string and did this:

fprintf(file, somestr);

...instead of this:

fprintf(file, "%s", somestr);

It's a stupid bug, yes, but not a rare one. I once ran this checker on a large application and found dozens of instances of this bug. I also found dozens of false positives, from things like these:

snprintf(somewhere->buffer, MAX_BUFFER,
         "format string", args);
fprintf(file, message_format_strings[status], description);

But they were obvious false positives, so it was easy to ignore them.

Here's an even less selective checker:

grep '(\w\+ \?\*)'  #beware different versions of grep

This finds pointer typecasts, which (in C++, more than in C) are often misguided — they might indicate unsafe downcasts, or non-type-safe containers, or casting away constness, or simple unnecessary casting. It also finds a great many false positives, of course — mostly function prototypes and innocent casts.

These checkers don't prove the absence of the errors they look for. A program that doesn't contain the string scanf might still call it via a library or by dlsym. The printf checker can be defeated by something as simple as a printf-like function whose name doesn't contain printf — hardly a rare occurrence! The cast checker misses mundane things like (char**) and (IntPtr). They only find bugs; they don't guarantee their absence.

They're also not very powerful. They find only certain specific errors, not a wide variety. A real lint program can do much better.

But when you don't have a real lint handy, or when your lint doesn't find the problem you're worried about, simple textual checkers can be valuable.

“They only find bugs”. “Only certain specific errors”. Faint criticism.

In addition to being useful, these checkers are a reminder that there are many ways to check programs. None of them are typecheckers in either sense — not in the common sense, because they don't check datatypes, and not in the type-theory sense, because they don't classify expressions. They aren't even aware of the existence of expressions — they see code only as text. This is not a very powerful approach, but it's enough to find a lot of bugs.

Not all checkers are typecheckers.

No comments:

Post a Comment

It's OK to comment on old posts.