Sorry for the lack of updates, folks. I’m in the middle of a major heap of coding, creating some basic network-server infrastructure that I need for my project. The problem is a fairly common one:

  • Several modules are layered on top of one another in a multithreaded system, with requests being issued “downward” and results coming the other way.
  • The module issuing/forwarding a request does not know at the time whether the request will complete immediately or asynchronously, but must be prepared for either.
  • If a request is completed immediately, processing should devolve into a series of simple function calls, with each “borrowing” locks and other context from its caller, as is the case with non-asynchronous code.
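
The second requirement can be sketched in a few lines of C. All of the names here are invented for illustration (this is not the author's actual code): a lower module either finishes the request before returning, or parks it and fires the callback later, and the caller has to be ready for both outcomes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: a request that may complete synchronously or
 * asynchronously.  The issuer learns which from the return value. */
typedef enum { REQ_DONE, REQ_PENDING } req_status;

struct request;
typedef void (*req_callback)(struct request *req);

struct request {
    int          opcode;   /* what the caller wants done              */
    int          result;   /* filled in on completion                 */
    req_callback done;     /* invoked exactly once, sync or async     */
};

/* Lower module: completes trivial requests inline, defers the rest. */
req_status issue_request(struct request *req)
{
    if (req->opcode == 0) {            /* trivial case: finish now     */
        req->result = 42;              /* placeholder result           */
        if (req->done != NULL)
            req->done(req);            /* callback runs in our context */
        return REQ_DONE;
    }
    /* Otherwise the request is queued and req->done fires later from
     * another thread; the caller must not touch req in the meantime.  */
    return REQ_PENDING;
}
```

On REQ_DONE the caller can keep going with its locks still held; on REQ_PENDING it unwinds and lets the callback pick up the work later.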

As I’m sure anyone familiar with this kind of programming knows, reconciling these requirements is quite a challenge. What I’ve ended up with is a combination of universal structures for requests and queues, and a generic inter-module interface based on dispatch tables and queues. Sound familiar? It should, because several well-known systems have been built on a similar formula.

The real secret ingredient here is the way it deals with reentrant callbacks. In a lot of similar systems, reentrancy is not handled well. Many programmers resort to releasing and reacquiring locks to avoid reentrancy issues altogether; not only does that hurt performance, it also introduces the possibility that state will change while the lock is released. Then they find that the callback needs some extra piece of context on the “return trip,” so they add a private queue/table to associate that context with the reply – at the cost of more locking, more trips through the memory allocator, etc. Then there’s always that head-scratchingly weird bit of code to deal with the inevitable case where the information has just mysteriously gone missing. I’ve seen many people head down that particular road to insanity before.
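
The dispatch-table interface mentioned above might look roughly like this. Again, every name is invented for illustration, not taken from the actual project: each layer exports the same small ops vector, so modules can be stacked without compile-time knowledge of one another.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic inter-module interface built on dispatch tables. */
struct module;

struct request {
    int result;               /* filled in by whichever layer completes it */
};

struct module_ops {
    int (*start)(struct module *self, struct request *req);
};

struct module {
    const struct module_ops *ops;     /* this module's entry points     */
    struct module           *below;   /* next layer down the stack      */
};

/* A pass-through layer: forwards the request to whatever sits below. */
static int forward_start(struct module *self, struct request *req)
{
    return self->below->ops->start(self->below, req);
}

/* A bottom layer: completes the request immediately. */
static int bottom_start(struct module *self, struct request *req)
{
    (void)self;
    req->result = 7;          /* placeholder result */
    return 0;
}

static const struct module_ops forward_ops = { forward_start };
static const struct module_ops bottom_ops  = { bottom_start };
```

Because every layer speaks the same ops vector, inserting a caching or logging module in the middle of the stack is just a matter of pointing `below` at it.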

Now contrast this with a system where the per-module context is closely bound to an extensible request structure, so that no lookup is necessary and there’s no possibility that the context won’t be found. In addition, the callback now has access to one crucial bit of information maintained by the system: whether it’s being called in the original call context or asynchronously from a different one. Obviously, the callback still needs to retake locks and so on in the latter case, but now it has readily available and accurate information that lets it optimize the original-context case. None of this is earth-shattering stuff, but it adds up. Avoid a spurious lock/unlock here, a redundant lookup there, a few extra possible error cases… and pretty soon it starts to look like an efficient and maintainable system, where before everything seemed barely held together with chicken wire and duct tape.
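
A minimal sketch of that idea, with invented names: the per-module context travels inside the request itself, and the framework sets one flag before invoking the callback, telling it whether it is running in the issuer’s original call context or asynchronously from another thread.

```c
#include <assert.h>
#include <stdbool.h>

#define CTX_SIZE 64   /* arbitrary size for the embedded context slot */

/* Hypothetical extensible request: context rides along with it, so the
 * callback never has to look anything up in a side table. */
struct request {
    bool same_context;            /* set by the framework before the callback:
                                     true  = completed inline, caller's locks
                                             are still held;
                                     false = we're on another thread and must
                                             relock and revalidate           */
    unsigned char ctx[CTX_SIZE];  /* per-module context, always present      */
};

static int completions_needing_relock = 0;

/* Invoked by the framework when the lower layer finishes the request. */
static void on_complete(struct request *req)
{
    if (!req->same_context)
        completions_needing_relock++;  /* async path: retake locks first */
    /* Either way, our context is right there in req->ctx -- no lookup,
     * no "context mysteriously missing" error path to write. */
}
```

The synchronous path skips the relock entirely, which is exactly the “borrowing locks from the caller” behavior the requirements above ask for.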