Canned Platypus

Making the world better, one byte at a time.

Sep
30

Designing Interfaces

OK, it’s geek time again. This time the topic is how to write code that can withstand change. It’s very common to have one function X call another function Y with a pointer argument which is expected to be non-NULL. Those of you who only program in languages without pointers can skip most of this article, though the general statements at the end might still be of interest. Those who don’t even program probably left already. ;) The question here is: how should the “contract” between these two functions be defined and enforced?

There are basically three options in this situation.

  • Y should always check its argument, and terminate (return with an error, throw an exception if that’s an option) if the pointer is NULL.
  • Y should have an assertion of the form (p != NULL).
  • X should only call Y if it knows the pointer is non-NULL (e.g. if it just checked).

I’ve listed the above in approximate order of preference. The debate between assertions vs. return values vs. exceptions is a complex one, with each answer being correct for some circumstances, so you can flip the first two around if you want. What should be incontrovertible, though, is that the last option belongs at the bottom of the list. If a function does anything that’s at all useful, it’s highly likely that the number of calls to it (and the number of programmers responsible for those calls) will increase over time. If this part of X’s calling convention is not enforced, some day someone will forget or not know that they shouldn’t call it with a NULL pointer. If you’re lucky, this will be detected early in development or testing and get fixed before the code gets released. Otherwise, your program will crash in the field. It’s never a good idea to rely on luck, especially when the consequences could be severe. Assuming that your tests are so good that they’ll catch everything is just hubris, and equivalent to relying on luck.

This coding rule (brought on, BTW, by fixing a bug that involved the SG driver, QLogic HBA driver, and asynchronous I/O in Linux) is just one example of a more general one:

Interfaces should minimize burden on the caller. An interface that just requires a function call is generally preferable to one that requires a function call plus something else.

In this case the “something else” is ensuring that the pointer argument is non-NULL. In another case it might be requiring that some lock be held, or interrupts disabled, or some resource already acquired, or any other kind of event either blocked from happening or scheduled to happen. Obviously you can’t follow this rule in every instance; the core of good software design is knowing when to violate one rule for the sake of another. For example, if you need to enable making two calls back to back without a “window” in between where a lock is not held, it’s reasonable to require that callers acquire the lock. In general, though, it’s a good idea when you’re designing your code to do so with an eye toward making it easy for others to call it correctly.

Comments

  1. I don’t think your rule is good for general application. Sometimes it’s better for the callee to make stronger requirements of the caller. It’s certainly never okay for the callee to silently allow inconsistency to spread — it must either weaken its preconditions (which means making an interpretation of the caller’s behavior into an acceptable precondition) or else loudly and explicitly signal otherwise such as with an exception — but I don’t think “weakest possible preconditions” is the right general rule.

    For a conrary view which influenced me, see Bertrand Meyer’s writings on “Why not to use defensive programming” in _Object Oriented Software Construction, Second Edition_.

  2. I donâ??t think â??weakest possible preconditionsâ? is the right general rule.

    As long as it’s understood as a guideline rather than a firm commandment, why not? Obviously it doesn’t apply in all cases, such as the locking scenario I mentioned, but if all else is equal why put the burden on the caller? Performance is rarely an adequate justification unless the check in question is expensive and API clients often make several calls to functions which repeat the same checks – and both are indications that the API was defined at the wrong level anyway. One approach I’ve often seen used, and occasionally used myself, is to have a simplified version of an API for general use and a more annoying (but also more efficient or flexible) version only for use by higher-level APIs. Both end up calling the same code to do the actual work, of course, but only the former follows the “least burden” rule. In the very common scenario where the choice is between “if (p != NULL) do_something()” in the caller and “if (p == NULL) return error” in the callee, though, I think the rule applies quite well.

  3. Well, I’m still mulling it over, but I think the problem is that “weakest possible precondition” means converting some error conditions into different error conditions, instead of loudly failing. Of course, “strongest possible precondition” means converting some non-error conditions into loud failures — which is I think what you have experienced. Although since it was so hard for you to debug I guess it wasn’t loud enough.

    There’s also the consideration of conventions. People expect to be held to some preconditions, e.g. if I pass you a data structure it is required to be internally consistent, or if I ask you to pop from a stack the stack is required to be non-empty. They don’t expect to be held to others — e.g. if there is an optional thing then I assume I can pass a NULL pointer instead of being required to pass some sort of “yes I know that this is optional and I promise it is consistent with the rest of the structure” sort of thing.

    Hm. I just realized that I misunderstood what you said — I just re-read your post and you are saying that the callee should do an assertion or exception, instead of silently corrupting stuff. I had thought you were saying that the callee should interpret the NULL pointer as being equivalent to some sort of “safe” value and continue. Sorry to read so sloppily.

Leave a Comment