There are many ways to design programs. Two of the best known represent opposite extremes: flowcharts tend to represent only control flow with little regard for data, and notations like UML (in its most common usage) tend to represent only data with little regard for flow. Flowcharters are a dying breed, but the industry is awash with UML weenies who develop frighteningly detailed models of their data without being able to conceptualize any non-trivial control flow which affects that data. Personally, I’ve found that the most important pieces of any workable design usually have to do with the space in between – in how control flows affect data, in particular with regard to state information. Typically a system has objects to represent various resources, requests, and components. Each of these has some actual data associated with it, plus links to other objects, plus some state information in the form of enumerated states and/or flags. The amount of state information directly affects the predictability and maintainability of a system, since every state or combination of states has to be handled and tested. Worse, the state of the system is the product of its component states, potentially leading to an exponential explosion, so one of the design rules I’ve learned is to keep states to a minimum. Another important rule is to make state explicit rather than implicit. In what follows I’ll cover some examples of implicit state, why they’re bad, and how explicit state is preferable.

My first example is one where the state of an object isn’t really implicit but might as well be because it’s hard to determine. Most programmers have a tendency to maintain each piece of object state in a separate variable, as with a network-connection object that has one flag to indicate whether it’s connected, another to indicate whether it has data pending, another to indicate whether an error has been detected, etc. Whether these separate pieces are maintained as separate fields or as bit definitions within a common “flags” field is less important than the fact that they all change independently. It’s not an absolute rule, but in general it’s better to combine flags into discrete states. From a debugging perspective, it’s easier to look at one datum instead of several. From a design perspective, enumerated states make it easier to track state transitions and avoid/eliminate bad ones. If a certain combination of flags should never occur but does, there’s often a problem with one piece of code looking at one flag and another piece of code (often written by a different programmer) looking at the other, and the two mashing things up even worse than before. With discrete states, the combination can’t even be represented. Ideally, either the code that would have set the flags inconsistently or one of the two pieces that would have looked at them will fail with a very obvious “invalid state” message. As a general rule, if you’re facing a plethora of flags you should probably combine them into state variables, and if you have a plethora of states or state variables you should probably stop using one object to represent multiple conceptually separate things.
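
As a rough sketch of what combining flags into discrete states might look like, here is the network-connection example in Python. The state names, the transition table, and the `InvalidState` exception are all assumptions made up for illustration; a real connection object would have its own set.

```python
from enum import Enum, auto


class ConnState(Enum):
    """Hypothetical discrete states replacing separate connected/pending/error flags."""
    DISCONNECTED = auto()
    CONNECTED_IDLE = auto()
    CONNECTED_DATA_PENDING = auto()
    FAILED = auto()


# Every legal transition is listed in one place; anything not listed is a bug.
# "Data pending while disconnected" has no entry because it has no state at all.
ALLOWED = {
    ConnState.DISCONNECTED: {ConnState.CONNECTED_IDLE, ConnState.FAILED},
    ConnState.CONNECTED_IDLE: {
        ConnState.CONNECTED_DATA_PENDING,
        ConnState.DISCONNECTED,
        ConnState.FAILED,
    },
    ConnState.CONNECTED_DATA_PENDING: {ConnState.CONNECTED_IDLE, ConnState.FAILED},
    ConnState.FAILED: {ConnState.DISCONNECTED},
}


class InvalidState(Exception):
    """Raised when code attempts a transition the table does not allow."""


class Connection:
    def __init__(self):
        # One datum to look at in a debugger instead of several flags.
        self.state = ConnState.DISCONNECTED

    def transition(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise InvalidState(
                f"invalid state transition: {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
```

With something like this, code that tries to drop a connection while data is still pending fails loudly at the point of the mistake, rather than leaving a contradictory pair of flags for some other piece of code to trip over later.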

The most common kind of true implicit object state is that which is embedded in a thread’s execution context (i.e. the stack). One of the things that really annoys me when I’m debugging is when I can’t find out what’s happening to objects by looking at the objects themselves but instead have to get stack traces for every thread to see which is operating on what. Stack crawling is more work than data dumping. It’s not just an inconvenience in debugging, though. Consider how a component in a highly available system responds to a fault elsewhere in the system, or how a component in just about any dynamic system might respond to a configuration change. Usually this involves waiting for some requests to complete, aborting others, freeing resources, etc. How will the component know what to do with each object if the object itself doesn’t contain enough state? Do you want your fault-handling code to crawl other threads’ stacks (which are still changing) to figure out what to do? Of course not. If an object is in use by a thread, that fact should be noted within the object itself, along with enough information for the fault handler to synchronize with that thread as necessary.
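
One way this could look, as a minimal Python sketch: the object records which thread is using it and carries the events a fault handler needs to synchronize with that thread. The `Request` class, its field names, and `handle_fault` are hypothetical, and the abort protocol here (the worker checks an event and releases the object) is just one assumed convention among many.

```python
import threading


class Request:
    """Hypothetical request object that records who is working on it."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.owner = None                  # name of the thread using this object, if any
        self.released = threading.Event()  # set whenever no thread owns the object
        self.released.set()
        self.abort = threading.Event()     # set by the fault handler to ask the owner to stop

    def acquire(self):
        with self.lock:
            if self.abort.is_set():
                raise RuntimeError(f"{self.name} is being torn down")
            if self.owner is not None:
                raise RuntimeError(f"{self.name} already in use by {self.owner}")
            self.owner = threading.current_thread().name
            self.released.clear()

    def release(self):
        with self.lock:
            self.owner = None
            self.released.set()


def handle_fault(requests, timeout=1.0):
    """Decide what to do from the objects alone; no stack crawling required."""
    report = {}
    for req in requests:
        with req.lock:
            owner = req.owner   # read the recorded owner and request the abort atomically
            req.abort.set()
        if owner is None:
            report[req.name] = "idle: freed immediately"
        else:
            # The object itself tells us whom to wait for and how to wait.
            drained = req.released.wait(timeout)
            outcome = "drained" if drained else "timed out"
            report[req.name] = f"in use by {owner}: {outcome}"
    return report
```

A worker thread in this scheme would call `acquire()`, check `abort` at convenient points, and call `release()` when done. The point is only that a data dump of the object answers the question "who is operating on this?" without consulting any stack.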

This brings us to the absolute worst kind of implicit state: state that is recorded neither within the object itself nor within any thread associated with the object. How can that happen? Imagine that you have a request object, and the request is waiting for some external event, either a per-request event such as a reply to a secondary request sent over a network, or a global event. When that event occurs, the object will “wake up” again and go through its next state transition, but while it’s waiting there’s no information in the object itself about what it’s waiting for, and it won’t appear in any thread’s stack. Obviously it has to be found somehow when the event it’s waiting for occurs, but that might be done by matching an ID contained in the object against one contained in a network message; the “missing state” is actually in a network packet somewhere else right now. This is an absolute nightmare from a maintainability standpoint, and should be avoided at all costs. If an object is in a state where an operation on it is pending, the nature of that operation and any information about other parties involved in it should be contained within the object itself. The difference between looking at an object and being able to say “this is waiting for an X from Y” and having to look back through logs/traces (which might not go back far enough) to see where an operation was initiated is the difference between a product that works and one that fails in mysterious, impossible-to-debug ways.
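
To make that concrete, here is a small Python sketch in which a request records what it is waiting for before the secondary message is sent. Everything in it (`PendingOp`, `Request`, `Dispatcher`, the field names, and the format of the description string) is a hypothetical illustration, and the actual network send is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PendingOp:
    """What a request is waiting for, and from whom (all names hypothetical)."""
    kind: str          # e.g. "reply"
    peer: str          # the other party involved in the operation
    message_id: int    # ID that the eventual reply is expected to carry
    started_at: float  # when the wait began, for spotting stuck requests


@dataclass
class Request:
    name: str
    pending: Optional[PendingOp] = None

    def describe(self) -> str:
        """Answer 'what is this waiting for?' from the object alone."""
        if self.pending is None:
            return f"{self.name}: not waiting"
        p = self.pending
        return f"{self.name}: waiting for {p.kind} #{p.message_id} from {p.peer}"


class Dispatcher:
    def __init__(self):
        # Lookup table used to find the request when a reply arrives.
        self.waiting: Dict[int, Request] = {}

    def send_secondary(self, req: Request, peer: str, message_id: int, now: float):
        # Record the wait in the object before the message leaves, so the
        # state never lives only in a packet somewhere on the network.
        req.pending = PendingOp("reply", peer, message_id, now)
        self.waiting[message_id] = req
        # (actual network send omitted in this sketch)

    def on_reply(self, message_id: int) -> Optional[Request]:
        req = self.waiting.pop(message_id, None)
        if req is not None:
            req.pending = None  # the wait is over; next state transition goes here
        return req
```

Matching by ID still happens when the reply arrives, but the ID is no longer the only trace of the operation: dumping the request shows "waiting for an X from Y" directly, along with how long it has been waiting.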