The Kautz Graph

I’ll bet nobody, except some people I’ve told about it before, could identify what this is.
string art
Give up? OK, it’s a string-art representation of three disjoint Hamiltonian cycles in a degree-3 diameter-3 Kautz graph. A what? Unfortunately, a Kautz graph is not an inherently easy thing to grasp or describe, and the Wikipedia article does a particularly lousy job. (Yeah yeah, I know, I could write my own, but I’ve had enough of Wikipedia politics already.) We have some decent descriptions that we’ve used for presentations at work (it’s the topology used in our machines), but what’s on the website doesn’t seem that great. So, here are some of the basics.

A degree-N Kautz graph consists of nodes which each have N input and N output links (arcs, edges, whatever terminology you prefer) arranged in a particular order. Specifically, each node has a number expressed in base N+1, such that no two adjacent digits are the same. The node’s output links are defined by shifting its number left and filling on the right with one of the N eligible digits (anything except the digit that’s already last); its input links are defined by shifting right and filling on the left likewise. So, for example, node 132 (in a degree-3 graph) will have output links to 320/321/323 and input links from 013/213/313.

Why on Earth is any of this interesting or useful to anyone but a graph theorist? Because it turns out that Kautz graphs have a lot of interesting properties. For example, a degree-3 diameter-6 Kautz graph guarantees that any two nodes out of 972 have three completely separate paths, each involving no more than six hops, between them. For the same number of nodes, a hypercube would require ten hops and a torus fifteen. That’s why SiCortex uses Kautz graphs to connect nodes, with the SC072/SC648/SC5832 being degree-3 networks of diameter 2/4/6 respectively. The number of nodes in a Kautz graph of degree k and diameter d is (k+1)·k^(d-1). For k=3 and d=6, that’s 4·3^5 or 972 nodes, times 6 cores per node = 5832, hence the name of the system.

Kautz graphs are interesting to work with. For one thing, it’s almost impossible to conceptualize which nodes are connected to which others, through which paths. Can you tell me how to get from node 3101 to node 0321? Uhhh. Well, there’s always one obvious path formed by simply shifting in the digits of the destination one by one (in this case 3101 to 1010 to 0103 to 1032 to 0321), but there may be shorter paths and by the way good luck finding the other two that are guaranteed to exist. In practical effect, the connections and routes seem just as random as those in some arbitrary collection of nodes on the internet. Then there’s the issue of tiling a Kautz graph, or breaking it down into topologically-equal pieces so (for example) you can make a diameter-4 or diameter-6 system using the same pieces. That’s some of SiCortex’s secret sauce, but I don’t need to worry about giving away our methods because I don’t know them and even if I did I doubt that I’d understand them. Basically we just have a few guys who are really smart and have studied this a lot, and they seem to know how to do it.
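
For what it’s worth, the “obvious” shift-in path is trivial to write down in code, and a brute-force breadth-first search will find a true shortest path (just don’t ask it for the other two disjoint ones). This is an illustrative Python sketch only:

```python
from collections import deque

def shift_path(src, dst):
    """The obvious route: shift in the destination's digits one at a time.
    (Assumes src's last digit differs from dst's first digit; if they
    matched, a shorter route would exist anyway.)"""
    path, node = [src], src
    for d in dst:
        node = node[1:] + (d,)
        path.append(node)
    return path

def shortest_path(src, dst, degree):
    """Breadth-first search over the shift links."""
    frontier, seen = deque([(src, [src])]), {src}
    while frontier:
        node, path = frontier.popleft()
        if node == dst:
            return path
        for d in range(degree + 1):
            if d != node[-1]:           # keep the no-adjacent-repeats rule
                nxt = node[1:] + (d,)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [nxt]))

# 3101 -> 1010 -> 0103 -> 1032 -> 0321, as described above
print(shift_path((3, 1, 0, 1), (0, 3, 2, 1)))
```

For this particular pair the BFS confirms there’s no route shorter than the four-hop shift-in path.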

Of course, none of this explains why I went to the effort of creating a string-art manifestation of a Kautz graph. Mostly it’s just because I felt like the versions on slides and such were all a bit disappointing. There’s just something about this version’s physical reality that makes it more compelling. As it turns out, actually wrapping the strings around the posts made me more acutely aware of some of the Kautz graph’s strange symmetries than I had been before. There was also a bit of an intellectual challenge involved. Pretty early I hit on the idea of dividing the links into three cycles and using different colors to represent them, and then I had to figure out how to find those cycles. In fact, I spent more time writing the program to do that than I did physically building the thing, and one of our local Kautz gurus was actually a bit surprised to find that there even were three disjoint Hamiltonians in a diameter-3. Lastly, of course, I did it because it’s unique. I’m pretty darn sure nobody else has one, and in today’s mass-produced world the idea that I have something that would not exist if I hadn’t made it myself is pretty satisfying.

UPDATE: one of the aforementioned smart people offered the following corrections and graciously gave permission for me to include them here.

In the second paragraph, you say that any pair of nodes will have three separate paths with no more than 6 hops between them. In fact, the first (shortest) path has no more than 6 hops, the second typically has 7, and the third in a few cases requires 8 hops.

All paths can be thought of as shifting in the destination node number. Short paths arise when the trailing digits of the origin node number correspond to leading digits of the destination node number, so fewer shifts are required than the diameter.

The alternate paths to a destination arise by shifting in a “wrong” digit (one which does not take you on the shortest path to the destination), followed by shifting in the correct digits of the destination address.
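
That recipe translates almost directly into code. Here’s a hypothetical Python sketch of it (illustrative only, and it ignores the edge cases where eight hops are needed), continuing the 3101-to-0321 example:

```python
def alternate_paths(src, dst, degree):
    """Alternate routes: first shift in a 'wrong' digit (one that is neither
    the current last digit nor the first digit of the direct route), then
    shift in the full destination address."""
    paths = []
    for w in range(degree + 1):
        if w in (src[-1], dst[0]):
            continue
        node = src[1:] + (w,)   # the one deliberate wrong turn
        path = [src, node]
        for d in dst:           # then shift in every digit of dst
            node = node[1:] + (d,)
            path.append(node)
        paths.append(path)
    return paths

for p in alternate_paths((3, 1, 0, 1), (0, 3, 2, 1), 3):
    print(p)
```

For a degree-3 graph this yields two alternates on top of the direct route, i.e. three paths total, each alternate one hop longer than the full shift-in path, which is consistent with the seven-hops-typical figure for diameter 6.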

In a way, that underscores the point I made about how hard it is to conceptualize Kautz graphs. Consider the “shifting in a wrong digit” part. It’s fairly obvious that doing so will give you a new path, but considerably less obvious that the paths thus derived will be totally separate from one another. Kautz graphs are full of little surprises like this, which is what gives them their charm.

Snow Pictures

Continuing backwards, here are some pictures of the snowbanks that built up around the house earlier this month. The first one is from December 14, after the first big snowstorm.
Amy in a snowbank
The others require a little explanation. The day after the first snowstorm, I didn’t completely shovel the driveway before I went to work, figuring that the weather pattern wouldn’t last and it would thaw soon enough. That’s a pretty reasonable assumption in early December – most years. This time, though, we got another snowstorm almost on the heels of the first, and then another a week later, and haven’t had a good thaw until the last week or so. The pictures are from December 23, after things had thawed enough that I could attack the packed-down snow. It came up in big manhole-sized chunks, and just for fun I decided to make ramparts out of them in the snowbank.
ramparts I ramparts II
This last one actually precedes the previous two, but it’s easier to explain in this order. With the heavy snowfall immediately followed by a thaw, it seems that a lot of people were having trouble with ice dams and huge icicles. After I had pried loose a few of the larger ones I stuck them in the snowbank, where they formed the basis/inspiration for my subsequent “art project”. These are about 9-10″ at the base and nearly 2′ long, and one of them is still there – having outlasted the ramparts which have been mostly obliterated by the rain. The lone survivor is much thinner now, but as the snow has receded it actually seems taller than before.

Christmas Amy

I’m way behind on pictures, so I’m going to try something different. I’m going to work my way backwards, starting with yesterday, and just do small batches of pictures (e.g. those taken on one day) for a while. When I get all caught up I’ll resume with the rest of the Christmas pictures. So here’s just one from before I had to dump the camera’s memory card to make room for the rest.
Christmas 2007 Amy

The Night before Crash-mas

I’ve seen many versions of this before, but I think the following version from friend and coworker Jim Michaud is the best yet.

Twas the night before Christmas and all through IT
Who knew this night, would soon turn to sh**!
No systems were hung, for we have taken much care
and nagios would tell me if a problem was there
The computers were clustered all snug in their rack
Unaware there was a problem with their Ethernet jack

I had enjoyed a big meal and a refreshing nightcap
and was deeply enjoying a long winter’s nap
When next to my head, my phone made a clatter
I sprang from my bed to see what was the matter
Away to my office I flew like a flash
Tore through logins to try and mitigate the crash

I was needed at the office, and there I will be
Parking was easy, no one here but me.
When what to my wondering eyes do I see
but a “Login Timeout”, bad news for me!
With a login that hangs but the machine is not dead
I knew in a moment, I might be over my head

More rapid than normal my fingers did fly
And brought up the daemons to figure out why?
Now network, Now syslog, Now cyrus and portmap
On ssh ! On postfix ! On apache and LDAP
From single user mode in a login shell
but ssh logins still hang……what the h***?

As the RAID mirror rebuilds, I have time to waste
So back to my house to get food without haste
Then back to the office with no ideas that stick
Hoping the mirror rebuild would just do the trick

And then in desperation to bring up our email
I tried different hardware, but to no avail
As I rebooted the server, with a fair bit of doubt,
Same error as before, “Login Timeout”
But I did get a clue, but only just one
the kernel had errors for UDP checksum

My heart lept with joy that email would soon send
but the switch showed no errors, another dead end
My eyes getting blurry, hard to see the screen
It was just about time to get a jolt of caffeine

I tried a new NIC, in case that’s the cause
and I don’t mean the saint or jolly mister Clause
It proved somewhat useful as the system surely ran
Until a reboot brought me back to where I began

I changed back the hardware as it probably was fine
So reboot again, what seemed like number ninety-nine
The system was perfect in single user mode
But turn on networking and things just got hosed.

I tried many things but nothing would work
Then had an epiphany, I’m such a jerk!
By plugging the cable into another switch jack
The logins succeed without even a “hack”
I sprang to my desk to check my email
and just as I thought, they came without fail

It’s now four AM; twenty hours was this plight
Happy Christmas to all, and to all….zzzzzzzzzzzz

Hi Mom

I have a very special new blog-neighbor today: Out of Your Abyss. I call the author Mom, but the rest of you can call her Ruth.

Stable APIs

Over on Greg Laden’s somewhat hyperactive blog, I got into a little bit of a debate about whether or not stable APIs (Application Programming Interfaces) are a good thing. That led to a second post, in which Greg L cites Linux guru Greg Kroah-Hartman on the subject. Since Greg L had explicitly asked my opinion about what Greg K-H has to say, I’ll respond here and link back to there. First, let’s deal with a little bit of Greg K-H’s introduction.

Please realize that this article describes the _in kernel_ interfaces, not the kernel to userspace interfaces. The kernel to userspace interface is the one that application programs use, the syscall interface. That interface is _very_ stable over time, and will not break. I have old programs that were built on a pre 0.9something kernel that still work just fine on the latest 2.6 kernel release. That interface is the one that users and application programmers can count on being stable.

What Greg K-H is talking about is therefore actually an SPI (System Programming Interface). Normally it would be OK to use the more familiar term API anyway, but in the interests of clarity I’m going to adopt strict usage here. What I want to point out here is that most of the supporting-legacy-code bloat which Greg L cites as a major liability for Windows is not in the SPI that Greg K-H is talking about. All that stuff about code to support old applications or whatever is in the API, and there’s plenty of the same sort of cruft in Linux’s API for the same reasons.

Moving right along, what Greg K-H seems to be arguing is not that a stable SPI is worthless in and of itself but that the need for one is largely subsumed by having code be part of the main Linux kernel. He’s actually right in a particular context, but I don’t think it’s really the proper context for three reasons.

1. There are things out there besides PCs
If the code in question is a driver for a consumer-grade Ethernet card that dozens of kernel hackers are likely to use in their own machines, then most of Greg’s reasons for getting code into the kernel make sense. Other developers probably will be able to add features or fix bugs, or make SPI-related changes, to your code, and then test those changes. On the other hand, if you’re working on code for a cutting-edge high-end system that uses one of the less common CPU architectures, with an unusual and notoriously tricky memory-ordering model, with a lot of other features that bear little relation to what you might find in your desktop PC, it’s a bit different. Other developers might well be able to make useful changes for you. They also might screw them up in very hard-to-debug ways, particularly if they bring PC assumptions with them to something that’s not a PC. They certainly won’t be able to test their changes. In the end, getting your code into the mainline kernel won’t reduce your maintenance burden a whole lot.

2. Everyone has to play along
It’s one thing to say that you’ll play along and get your code into the kernel, but many projects nowadays involve drivers and patch sets from many sources. Getting your own patch A and/or patch B from vendor X into the kernel might not help you all that much if vendors Y and Z decline – for whatever reason, good or bad – to play along by doing likewise for patches C through G. Sometimes they want to do it, but their patches are rejected by the Kernel Gods for what might be legitimate technical reasons but are just as often petty political ones. Either way, you’re still going to be stuck with complex multi-way merges cascading outward any time any of those components moves forward, either because the vendor did something or because someone else changed part of the SPI. In other words, you don’t really gain much until that last vendor joins the club. By contrast, every attempt to preserve SPI compatibility brings its own immediate gain, even if other parts of the SPI change.

3. It doesn’t scale
“One kernel to rule them all and in the darkness bind them” solves some problems, but bundling every bit of device- and architecture-specific code with the kernel has its costs too. How many millions of hours have been wasted by developers configuring, grepping through, and diffing against PA-RISC and s390 code even though they’ll never use anything but an x86 on their project, just because they do those commands at the whole-tree level and never got around to automating the process of pruning out irrelevant directories every time they sync their tree? Even worse, how many times has someone who thought of making a change balked at the idea of applying it to every file it affects? How many times have they forged ahead anyway, but then gotten bogged down dealing with bugs in a few particularly intransigent variants? How much time has been wasted on the LKML flame wars that result? Preserving every architecture and device in one tree can have the same chilling effect on development as preserving every past SPI version.

If you’re living in the same desktop-and-small-server universe that Windows lives in, and you don’t have to deal with development partners who can’t or won’t get their own patches in the kernel, then getting your own code into the kernel might seem to obviate the need for a stable SPI and bring other advantages besides . . . this year. Down the road, or if those constraints don’t apply to your project, that might not be the case. SPI instability is a bad thing in and of itself, even if the pain doesn’t seem too great or there are other reasons to endure it. As I said to Greg L, it’s not something that makes Linux better than Windows. It’s an artifact of where Linux is on the adoption curve, quite reasonably more concerned about attracting new users than about alienating old ones. As Linux climbs that adoption curve, perhaps to a point of becoming a dominant operating system, I think that calculus will change. Breaking SPI compatibility is sometimes justified, but it’s almost never anything to be proud of.