Network Algorithmics

One of the things I’ve been doing while I have time off is catching up on my reading. Between looking after Amy, household chores, optometrist and dentist visits, etc., I’ve been reading bits and pieces of two books. One is Natural Capitalism, which is certainly thought-provoking even though its conclusions might be a bit misguided, and the other is George Varghese’s Network Algorithmics. So far I’m only up to chapter eight, and it has been a mixed bag. For one thing, the author makes lots of untrue claims about Fibre Channel, which he persistently misspells as FiberChannel. On the other hand, much of the material should be familiar to readers of my own server design article, and it’s nice to see a book-length treatment of such an important but often-neglected technical area. Chapter five on avoiding copies and chapter six on avoiding context switches should be particularly familiar. I think Varghese mostly Gets It, though I suspect he’s a better techie than he is a writer because he often seems to start with a good point and then lose his way in subsequent levels of elaboration. For example, at the start of chapter eight he’s trying to explain why “early demultiplexing” … basically the idea of examining an incoming packet and putting it on a specific queue immediately, instead of leaving it on a common queue where it might block other, more important packets … is a good idea. Among his justifications are ease/efficiency of implementing protocols in user space, and early checking of things like packet lengths. I’ll address each of these in turn.
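To make the early-demultiplexing idea concrete, here’s a minimal sketch in C. The queue structure and the TCP-means-control rule are my own illustrative inventions, not Varghese’s; the point is only that classification happens once, at arrival, so bulk traffic never sits in front of latency-sensitive traffic on a shared queue.

/* A minimal sketch of early demultiplexing, assuming IPv4 and a
 * trivial array-backed queue. All names are illustrative, not from
 * the book. */
#include <stdint.h>
#include <stdio.h>

#define QDEPTH 64

struct pkt_queue {
        const char    *name;
        const uint8_t *pkts[QDEPTH];
        int            count;
};

static struct pkt_queue control_q = { .name = "control" };
static struct pkt_queue bulk_q    = { .name = "bulk" };

static void pq_enqueue(struct pkt_queue *q, const uint8_t *pkt)
{
        if (q->count < QDEPTH)
                q->pkts[q->count++] = pkt;
}

/* Classify at arrival: one header peek decides which queue gets the
 * packet, instead of parking everything on one shared input queue
 * where a burst of bulk traffic would delay more important packets. */
static void rx_packet(const uint8_t *pkt, size_t len)
{
        /* Byte 9 of an IPv4 header is the protocol field; real code
         * would validate the header first, of course. */
        if (len >= 20 && pkt[9] == 6)   /* 6 == TCP: call it "control" */
                pq_enqueue(&control_q, pkt);
        else
                pq_enqueue(&bulk_q, pkt);
}

int main(void)
{
        uint8_t tcp_pkt[20] = { 0x45 }, udp_pkt[20] = { 0x45 };
        tcp_pkt[9] = 6;         /* TCP */
        udp_pkt[9] = 17;        /* UDP */
        rx_packet(tcp_pkt, sizeof(tcp_pkt));
        rx_packet(udp_pkt, sizeof(udp_pkt));
        printf("%s: %d, %s: %d\n", control_q.name, control_q.count,
               bulk_q.name, bulk_q.count);
        return 0;
}

A real stack would hash the full five-tuple to a per-flow queue rather than peeking at one byte, but the classify-once-then-enqueue shape is the whole trick.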

Varghese is obviously fond of user-level protocol implementations. I’m not. Perhaps the difference lies in the environments in which we’ve worked. User-level protocol implementations can work very well in “closed” environments where one has full control over what activities occur on the system and which users have access. Unfortunately, these advantages are rapidly overshadowed in a more “open” general-purpose environment. To see why, consider resource management. In a closed system, one can exercise a certain amount of preemptive control over resource usage. Analyses can be done for a finite set of circumstances, yielding optimal allocation strategies and/or rules that programmers within the system must live by. In an open system, such analysis, specialization, and rule enforcement are impossible. Protecting users from one another in such a system is mandatory, even though it saps performance and increases code complexity. As it turns out, many systems one might think of as closed (e.g. Revivio’s) are really open, because they rely on technology from the open-systems world and/or integrate multiple third-party components that lack the kind of design coherence that Varghese seems to assume.

On the second point, of prechecking lengths, Varghese refers to one of his “15 Principles Used to Overcome Network Bottlenecks,” which are sort of an inverse of my “Four Horsemen of Poor Performance” and are referred to throughout the book as Pn (where n is the principle number).

For example, rather than have each layer protocol check for packet lengths, this can be done just once in the spirit of P1, avoiding obvious waste.

I object to this suggestion on two counts. First, combining these checks requires that the low-level code know the packet formats for every protocol above it. Ick. This is not just an aesthetic “layers are pretty” objection, either; in an open system the complete set of upper-layer protocols cannot be known until run time. Enabling upper-layer code to “teach” lower-layer code about packet formats both constrains those formats and introduces massive amounts of API complexity. The second objection is that Varghese is applying the wrong rule. Instead of applying P1, Varghese should be applying P2b (evaluate lazily). Under load, it’s likely that a fairly high percentage of packets will get dropped, and precalculating anything for about-to-be-dropped packets is a waste. One might argue that this is a violation of the principle of optimizing for the common case (which I know from Hennessy and Patterson), but it’s actually an application of another rule that doesn’t make Varghese’s list.

Always know what you’re optimizing for.

Most people should be optimizing for the high-load case, because the low-load case mostly takes care of itself. For example, nobody’s going to care if your web server handles a single request 10% faster when everything the server does accounts for less than 1% of the total latency anyway. They might care, however, if your server handles high load 10% better so they only need to provision nine servers for an application instead of ten. When designing to maximize throughput, you must always think in terms of the characteristics of the system under load, not at idle, and the two are often very different. Interestingly, this is where Varghese could have backed into the correct answer, because he phrased his P11 as optimization of the expected case rather than the common case. High load is the expected (if not common) case, and under high load packet drops are frequent. Thus an application of P11 followed by P2b suggests that we should avoid precalculation in this case, which I believe is the correct answer.
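For contrast, here’s a sketch of what applying P2b instead would look like, with stub functions standing in for the real overload check and the per-layer length validation (all names invented for illustration):

/* Contrasting eager vs. lazy (P2b) validation in a receive path.
 * Everything here is an illustrative stub, not code from the book. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static bool overloaded = true;  /* pretend we're under heavy load */

static bool queue_full(void) { return overloaded; }

static bool length_checks(const uint8_t *pkt, size_t len)
{
        (void)pkt;
        printf("expensive per-layer length checks ran\n");
        return len >= 20;
}

static void deliver(const uint8_t *pkt, size_t len)
{
        (void)pkt; (void)len;
        printf("delivered\n");
}

/* Eager: validate first, then discover the packet must be dropped
 * anyway. Under load, that validation work is pure waste. */
static void rx_eager(const uint8_t *pkt, size_t len)
{
        if (!length_checks(pkt, len))
                return;
        if (queue_full())
                return;         /* checks already paid for, for nothing */
        deliver(pkt, len);
}

/* Lazy (P2b): take the cheap drop decision first; run the expensive
 * checks only for packets that will actually be processed. */
static void rx_lazy(const uint8_t *pkt, size_t len)
{
        if (queue_full())
                return;         /* cheapest possible drop path */
        if (!length_checks(pkt, len))
                return;
        deliver(pkt, len);
}

int main(void)
{
        uint8_t pkt[64] = { 0 };
        rx_eager(pkt, sizeof(pkt));     /* prints: checks ran (wasted) */
        rx_lazy(pkt, sizeof(pkt));      /* prints nothing: early drop */
        return 0;
}

Under load, the lazy version’s drop path touches almost nothing, which is exactly where the cycles matter.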

So I think Varghese got this one wrong. Big deal. It’s still (so far) a good book on a subject near and dear to my heart. Experts disagree all the time, and even if I don’t get anything out of the book besides an opportunity to state my case on some of the issues Varghese raises, it will have been worth it.

Picture Friday

In my never-ending effort to get the pictures I’ve taken up on the site, I’m just going to start posting new batches every Friday. Here are some pictures from our trip to Long Beach Island back in September, plus a couple from Drumlin Farm and Garden in the Woods. Hosting the pictures through Amazon is still new for me, so please let me know if you have any problems viewing them.

  • Cindy, Amy, Jan and Bob at the beach.
  • A dinosaur-shaped slide at a playground in Manahawkin.
  • A plastic rock-climbing section of the same playground.
  • Going for the aloof-movie-star look.
  • A sand-chair I made for her.
  • At Drumlin Farm with Daddy…
  • …and with Grampy.
  • This one’s from Garden in the Woods.

Unintended Consequences

Of all the bad ideas to come out of that state lately, the California Supreme Court decision in Barrett v. Rosenthal has to count among the worst. Lots of people in the blogosphere are hailing it as a victory for free speech. Let me be very clear in stating that it is no such thing. The case revolved around a woman (Rosenthal) who reposted a defamatory screed written by someone else (Bolen) about a pair of doctors (Barrett and Polevoy). The court held that, according to 47 U.S.C. § 230, Rosenthal could not be held liable for the content written by Bolen, even though she had taken positive action to give it broader distribution. One aspect of the case had to do with a legal distinction between “publishers” who exercise some control over content and can therefore be expected to know a priori whether it’s defamatory, vs. “distributors” (e.g. bookstores) who cannot be expected to have such knowledge unless/until the defamatory nature of specific material is brought to their attention. In general, blogs and forums and such are considered distributors rather than publishers, and the fact that Rosenthal published Bolen’s letter unchanged further strengthened her case. The next issue was whether a distributor could be held liable for third-party content after its defamatory nature had been pointed out, or whether §230 provides immunity even then. A lower court had held that such liability does exist, but the CSC disagreed. According to this decision, continuing to publish defamatory material, even knowing it’s defamatory, incurs no liability. Worse, that immunity has already been used (by AOL, based on a prior and similar case) to resist even identifying who the real author is. In other words, just saying that somebody else wrote something is now an airtight defense against defamation charges, no matter how justified those charges might be.

This issue is particularly relevant to me right now, because it just so happens that I have recently become the target of a defamation campaign. So far I know of posts by three personae (probably only one actual person) on two sites, making all sorts of vile accusations against me and in one case against my wife as well. There is not a single grain of truth to any of these accusations, which are not even self-consistent, and they have all been made by cowards hiding behind pseudonyms. So, what can I do? The proprietors of the two sites – ProBoards and Squidoo – don’t seem to care that they are enablers of defamation and harassment. I guess being used as a legal shield by the worst kind of human scum isn’t a big deal to them, so long as there’s a profit to be made from user-provided content and no legal liability for its nature. Hosting smears is probably good business. This seems to be a concern the CSC felt as well, but one about which they considered themselves powerless to do anything.

We acknowledge that recognizing broad immunity for defamatory republications on the Internet has some troubling consequences. Until Congress chooses to revise the settled law in this area, however, plaintiffs who contend they were defamed in an Internet posting may only seek recovery from the original source of the statement.

Never mind that the original source cannot practically be discovered, I guess. Quoting an earlier decision, they also had this to say.

The court noted that another important purpose of section 230 was “to encourage service providers to self-regulate the dissemination of offensive material over their services.”

Never mind, either, that providers seem uninterested (at best) in doing any such thing. Protections of free speech have never been construed as applying to abuses such as fraud and libel. If you provide an avenue for those things, you become responsible either for the content itself or for identifying its origin (so far as is possible). To shirk both kinds of responsibility is morally unacceptable. To all those who think this decision is a victory for free speech, I have a question: what about my right to express myself and make statements that I have every reason to believe are true, without becoming the target of coordinated harassment? Spurious defamation lawsuits are a problem. Initiating a race to see who can spread the most vicious slander, without any accountability, is not a solution to that problem. Maybe the CSC upheld the law, but if it denies people like me any relief then it’s a bad law. I challenge anyone to say honestly that they’d still feel it’s a good decision if they were in my shoes.

Ironically, if the not-so-good folks at ProBoards or Squidoo object to what is written here, they’re out of luck too. Somebody else wrote this and I’m merely reposting it, so they can just go swallow some razor blades.

Programmers’ Status Hierarchy

I know I’ve seen things like this before, but I just had to do one of my own. This diagram is a status hierarchy for programmers (and assorted others), in which an arrow from one box to another means that occupants of the first box look down on occupants of the second. There are some more notes below.

[diagram: programmers’ status hierarchy]

Note that some of the “looks down on” relationships are mutual. In addition, they all tend to be transitive to some degree. If A looks down on B and B looks down on C, then it’s a pretty good bet that A looks down on C even more – unless A doesn’t even realize or acknowledge that C exists or is separate from B. For example, most sales and management types don’t really recognize embedded systems programmers as such; we’re all just programmers to them. I left off Lisp/Scheme programmers for the simple reason that they directly look down on everyone else and I didn’t want to clutter up the diagram with that many arrows.

Moving On

Many of you probably know this already and have been wondering about the lack of a blog entry about it, but for at least as many more it’s news. I’ve quit my job at Revivio, and today was my last day. I’m going to take a much-needed break for a while, and spend more time with Amy so Cindy can take a break too. It has been a long four years at Revivio, particularly during the last few months when I was trying to handle one full-blown crisis after another so that other people could actually get some work done. There’s a lot more that I could say, but very little that it would be wise for me to say right now, so I’ll wait until my thoughts on that gel a little. Instead, I’ll focus on the other shoe, which should drop right about … now.

In January, I’ll be starting my new job at SiCortex. No, I didn’t pick them because it’s the only name I could find worse than Revivio. They’re making a new kind of supercomputer, with a focus on low power consumption and massive internal bandwidth. At 5832 processors it’s not small, but a quick glance at the Top 500 list shows that it’s far from the largest either. The just-announced Green 500 is more representative of their goal. By the G500’s current (and admittedly over-simplified) metric, the SiCortex systems should weigh in at around 190 MFlops/W – good enough for first place with room to spare. It’s also a bit cheaper than the competition. I’ll probably be working on a bit of everything since it’s a small software team, and I’m sure I’ll have more to say soon enough. Suffice it to say that there are all sorts of challenges for someone like me to tackle, and I look forward to doing it.

Hey, AccordionGuy!

http://accordionguy.blogware.com/ says:

ERROR: This blog is on hold because its bandwidth has been exceeded. Please contact your blog provider.

Time to walk down the hall and get someone to stop being silly?

Election Thoughts

I had expected that there would be a deluge of other people writing about the election, so I wasn’t going to bother, but I’ve actually been rather surprised by how quiet everyone has been. Also, It Affects You seems to be down (domain expired), so I’ll inflict my thoughts on people here. First, let’s look at what happened yesterday.

  • The Democrats took control of the House.
  • The Senate is at 50D vs. 49R with one independent and two contested races. In both of those races, the Democrats – Webb in Virginia and Tester in Montana – have the upper hand. If those results hold, and if Joe Lieberman keeps his (generally unreliable) word to caucus with the Democrats, that gives them control of the Senate as well.
  • In Massachusetts, Democrat Deval Patrick is our new governor, and we won’t be able to buy wine at grocery stores.

The interesting one of these is the Senate. Personally I think it’s ridiculous that the Democrats have to get 51 Senate seats to gain control. Of course, I also think it’s obscene that the way committees and so on work makes having the majority so important, but that’s a topic for another day. Nobody has a majority of seats in the Senate, so what do we use as a tiebreaker? Giving that power to whoever holds the executive branch seems like a ridiculous idea. Were the people who wrote the Constitution just tired by the time they got around to considering such possibilities, so that they couldn’t be bothered to think them through? Personally I think this is a good opportunity to let the people of D.C. have a voice in the Senate. Failing that, the party with either a plurality of seats or a majority of the popular vote in the most recent election – Democrats in both cases, this time – should win the tie. Oh well. Just another “charming anachronism” in our non-system of government.

Patent Claims

This probably won’t seem all that funny to people who’ve never had to do patent stuff, but I thought it was pretty hilarious. Here, via TechDirt, is one of the claims from a patent.

“9. The method of providing user interface displays in an image forming apparatus which is really a bogus claim included amongst real claims, and which should be removed before filing; wherein the claim is included to determine if the inventor actually read the claims and the inventor should instruct the attorneys to remove the claim.”

Ouch. I’ve actually been doing something similar with specs for years. I always put in at least one totally irrelevant or outrageous statement in each spec, just to see if people actually read them. I’ve even put in footnotes identifying them as such, and a couple of times even offered rewards for the first person to notice, but to no avail. There are some things that you just can’t get people to read, even when it affects them.

Collected Sayings of Chairman Amy

Here are a few of the funnier things Amy has said recently.

  • A while back, Amy was playing with dust motes in a shaft of sunlight. I asked if she was trying to catch the sunlight, and she seemed to like that phrase. More recently, she came up with a variant of her own: “Dropped the sunlight.”
  • Upon seeing the moon while heading to the car: “Moon on.”
  • When I was having trouble with a puzzle: “Daddy wrong.” (I’m sure I’ll hear that one again.)
  • Semi-randomly: “That’s nice,” “I feel better,” “Daddy says ‘MIPS’”

I’m pretty sure there are a couple more that I’ll remember at some point, but not right now. I’ll add them in comments as they come to me.