One of the things I’ve been doing while I have time off is catching up on my reading. Between looking after Amy, household chores, optometrist and dentist visits, etc., I’ve been reading bits and pieces of two books. One is Natural Capitalism, which is certainly thought-provoking even though its conclusions might be a bit misguided, and the other is George Varghese’s Network Algorithmics. So far I’m only up to chapter eight, and it has been a mixed bag. For one thing, the author makes lots of untrue claims about Fibre Channel, which he persistently misspells as FiberChannel. On the other hand, much of the material should be familiar to readers of my own server design article, and it’s nice to see a book-length treatment of such an important but often-neglected technical area. Chapter five on avoiding copies and chapter six on avoiding context switches should be particularly familiar.
I think Varghese mostly Gets It, though I suspect he’s a better techie than he is a writer because he often seems to start with a good point and then lose his way in subsequent levels of elaboration. For example, at the start of chapter eight he’s trying to explain why “early demultiplexing” … basically the idea of examining an incoming packet and putting it on a specific queue immediately instead of leaving it on a common queue where it might block other, more important packets … is a good idea. Among his justifications are ease/efficiency of implementing protocols in user space, and early checking of things like packet lengths. I’ll address each of these in turn.
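For readers who haven’t met the term, here is a minimal Python sketch of early demultiplexing as described above. It is only an illustration, not anything taken from the book: the one-byte flow tag, the queue names and the queue limits are all invented for the example.

```python
from collections import deque

# Hypothetical per-flow queue limits, chosen only for illustration.
QUEUE_LIMITS = {"control": 4, "bulk": 2}


def classify(packet: bytes) -> str:
    """Toy classifier: the first byte is a made-up flow tag."""
    return "control" if packet[:1] == b"C" else "bulk"


def early_demux(packets, limits=QUEUE_LIMITS):
    """Put each packet on its own bounded queue as soon as it arrives."""
    queues = {name: deque() for name in limits}
    dropped = []
    for pkt in packets:
        flow = classify(pkt)
        if len(queues[flow]) < limits[flow]:
            queues[flow].append(pkt)
        else:
            # Only the overloaded flow loses packets; other flows are unaffected.
            dropped.append(pkt)
    return queues, dropped


# A burst of bulk traffic with two control packets mixed in.
arrivals = [b"B1", b"B2", b"B3", b"C1", b"B4", b"C2"]
queues, dropped = early_demux(arrivals)
```

With a single shared queue of the same total size, the bulk burst could have crowded out the control packets. Here the bulk flow overflows only its own queue, and both control packets get through.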
Varghese is obviously fond of user-level protocol implementations. I’m not. Perhaps the difference lies in the environments in which we’ve worked. User-level protocol implementations can work very well in “closed” environments where one has full control over what activities occur on the system and which users have access. Unfortunately, their advantages are rapidly overshadowed in a more “open” general-purpose environment. To see why, consider resource management. In a closed system, one can exercise a certain amount of preemptive control over resource usage. Analyses can be done under a finite set of circumstances, yielding resource-allocation rules that can be used to derive optimal allocation strategies and/or to set rules that programmers within the system must live by. In an open system such analysis, specialization and rule enforcement are impossible. Protecting users from one another in such a system is mandatory, even though it saps performance and increases code complexity. As it turns out, many systems one might think of as closed (e.g. Revivio’s) are really open because they rely on technology from an open-system world and/or integrate multiple third-party components that lack the kind of design coherence that Varghese seems to assume.
On the second point, of prechecking lengths, Varghese refers to one of his “15 Principles Used to Overcome Network Bottlenecks” which are sort of an inverse of my “Four Horsemen of Poor Performance” and are ubiquitously referred to in the book as Pn (where n is the rule number).
“For example, rather than have each layer protocol check for packet lengths, this can be done just once in the spirit of P1, avoiding obvious waste.”
I object to this suggestion on two counts. First, combining these checks requires that the low-level code know the packet formats for every protocol above it. Ick. This is not just an aesthetic “layers are pretty” objection either; in an open system the complete set of upper-layer protocols cannot be known until run time. Enabling upper-layer code to “teach” lower-layer code about packet formats both constrains those formats and introduces massive amounts of API complexity. The second objection is that Varghese is applying the wrong rule. Instead of applying P1, Varghese should be applying P2b (evaluate lazily). Under load, it’s likely that a fairly high percentage of packets will get dropped, and precalculating anything for about-to-be-dropped packets is a waste. One might argue that this is a violation of the principle of optimizing for the common case (which I know from Hennessy and Patterson) but it’s actually an application of another rule that doesn’t make Varghese’s list.
Always know what you’re optimizing for.
Most people should be optimizing for the high-load case because the low-load case mostly takes care of itself. For example, nobody’s going to care if your web server handles a single request 10% faster if everything that the web server does is less than 1% of the network latency anyway. They might care, however, if your server handles high load 10% better so they only need to provision nine servers for an application instead of ten. When designing to maximize throughput, you must always think in terms of the characteristics of the system under load, not at idle, and they’re often very different. Interestingly, this is where Varghese could have backed into the correct answer because he phrased his P11 as optimization of the expected case rather than the common case. High load is the expected (if not common) case, and under high load packet drops are frequent. Thus an application of P11 followed by P2b suggests that we should avoid precalculation in this case, which I believe is the correct answer.
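To put a rough number on the wasted work, here is a small Python sketch comparing eager and lazy length checks for a burst that overflows a bounded queue. Everything in it is an assumption made for illustration: the packet format (a first byte declaring the payload length), the queue capacity, and the simplification that the whole burst arrives before anything is dequeued. It also assumes every packet is well formed, so the two strategies deliver the same packets and differ only in how many checks they perform.

```python
def length_ok(packet: bytes) -> bool:
    """Hypothetical format: the first byte declares the payload length."""
    return len(packet) >= 1 and packet[0] == len(packet) - 1


def run(packets, capacity, eager):
    """Return (delivered packets, number of length checks performed)."""
    checks = 0
    queue = []
    for pkt in packets:
        if eager:
            # Eager: validate on arrival, before knowing if there is room.
            checks += 1
            if not length_ok(pkt):
                continue
        if len(queue) < capacity:
            queue.append(pkt)
        # Otherwise the queue is full and the packet is dropped.
    delivered = []
    for pkt in queue:
        if not eager:
            # Lazy: validate only the packets that survived to be dequeued.
            checks += 1
            if not length_ok(pkt):
                continue
        delivered.append(pkt)
    return delivered, checks


# Ten well-formed packets arrive at a queue with room for four.
burst = [bytes([3]) + b"abc"] * 10
eager_delivered, eager_checks = run(burst, capacity=4, eager=True)
lazy_delivered, lazy_checks = run(burst, capacity=4, eager=False)
```

In this toy case both strategies deliver the same four packets, but the eager version performs ten checks to the lazy version’s four. The other six checks were spent on packets that were about to be dropped anyway.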
So I think Varghese got this one wrong. Big deal. It’s still (so far) a good book on a subject near and dear to my heart. Experts disagree all the time, and even if I don’t get anything out of the book besides an opportunity to state my case on some of the issues Varghese raises, it will have been worth it.