Buzzword Bingo

In case you haven’t heard of it, here’s a description of the game. And here’s a particularly checklist-ridden pitch to Pointy Haired Bosses for one product, which should lead to a quick win on most cards.

Serena® TeamTrack® is a Web-architected, secure and highly configurable process & issue management solution that gives you control, insight and predictability in your Application Lifecycle and Business Process Management throughout the enterprise. Through creating repeatable, enforceable, auditable, predictable processes, TeamTrack has directly increased profitability for customers worldwide in software development, manufacturing, government, financial services, healthcare, and other industries.

Would you like to know the really scary part? As mind-numbingly crapulous as that description is, somebody was actually insane enough to copy it! Oy vey.

Nothing In Particular

Hi all, I know it’s been a while. Part of the reason is that I spent the first half of the week visiting a customer in Memphis. Not much to talk about there, except to whine about how I always seem to go places during their worst weather. The one time I’ve been to Atlanta was during an ice storm. In Memphis it was 17 degrees Fahrenheit when we landed, and all the locals were saying it was the worst weather they’d seen in a decade. The building we were in actually went without plumbing for one of the days we were there; whether the cause was frozen pipes or ground soaked by melted snow was debatable, but either way it came down to weather that the local construction techniques don’t take into account.

The other thought floating to the top of my mind is comment spam. It’s only part of the reason I’ve stopped posting anything on It Affects You (the other part being that I’ve just gotten tired of spending so much time arguing with jerks of a sort I’d actively avoid in real life) but it’s a big part. It’s now impossible to have a conversation there, mostly due to advertising for various kinds of crap at xorg.pl which is obviously a very spam-friendly host. A related phenomenon is “splogs” (spam blogs) which are blogs that contain nothing but fragments of text captured from other sites apparently for the sole purpose of catching searches for a particular word. Until recently, if you searched for “canned platypus” (in quotes) at Technorati many of the hits would be for such gobbledygook. Now you get nothing, because it seems like Technorati has gone a bit overboard fighting this problem and in the process made legitimate sites such as this one disappear. They get a big one-finger salute for helping the sploggers ruin things.

I’m seriously beginning to wonder whether it would be worth it to fight fire with fire on this one. A while back I found out about Sugarplum, which is a sort of honeypot for the email-harvesting programs that spammers use. It sends back bogus email addresses but, just as importantly, it does so slowly so that it ties up resources on the harvester for as long as possible. What if a bunch of people set up comment-spam (and trackback-spam) honeypots, which similarly tied up resources in the programs used by that type of spammer? In its basic form I sort of think such a thing might just contribute to the problem of useful content disappearing amongst all of the netnoise, but maybe some variant would keep the spammers busy without disrupting real users’ web experience too much. I’ve also thought about using some of the same botnet technology as the spammers themselves to perform DDoS attacks on the sites they advertise but I generally don’t condone such “doing evil to do good” approaches. Poetic justice aside, the “collateral damage” to legitimate users of the same hosts is likely to outweigh any benefit, and the law of unintended consequences says there are likely to be other victims as well. As I’ve written before, the distinction between methods and results is not as clear-cut as most people think, so trying to achieve good goals with bad methods is usually a mistake.

Intel’s Battery Life Problem

There have been a lot of stories recently about a problem with Intel’s new Core Duo processors and USB 2.0 devices causing sharply decreased battery life. Mostly this has been a big Who Cares for me, but a Microsoft quote from AnandTech’s story on the problem caught my eye.

Windows XP SP2 installs a USB 2.0 driver that initializes any connected USB device. However, the USB 2.0 driver leaves the asynchronous scheduler component continuously running. This problem causes continuous instances of memory access that prevent the computer from entering the deeper Advanced Configuration and Power Interface (ACPI) processor idle sleep states.

I might be stretching a bit here, but this seems to be exactly the kind of problem that could have been anticipated on the basis of USB’s fundamentally broken design. Yes, that’s right, I said it’s broken. I use USB devices all the time, I don’t have any problem with them, but at a technical level I still consider it a broken protocol because it relies so heavily on polling. A little history might be in order here to explain how this happened and why it’s a problem.

A long time ago, Apple developed Firewire, which became standardized as IEEE 1394 and is also known by other names (such as Sony’s i.Link). Firewire was pretty fast for its time, at 400Mb/s in an era when 100Mb/s Ethernet was still considered pretty zippy, and it also had some special features to support devices such as video cameras. Later, Intel introduced USB, which was significantly slower (at 12Mb/s) than Firewire but also supposedly cheaper to implement. Intel spread a lot of FUD about the “Apple tax” of about $0.25 per port for licensing Firewire technology, never mentioning the fact that USB didn’t really come for free either; its cost was just buried a little deeper in the chipsets and motherboards that supported it. Firewire and USB could have coexisted quite easily, each being suited to different types of devices, but then Intel got even greedier and decided to go after the Firewire market with USB 2.0 at 480Mb/s. That’s nominally just a bit faster than Firewire, which is not at all a coincidence, but in reality test after test has shown that the very same device equipped with both interfaces will perform better using the supposedly-slower Firewire (note how the Firewire numbers are still better than even the higher PC-USB numbers). The reason for that will soon become clear.

While USB2 increased the raw bit rate significantly over its earlier sibling, it also carried along some USB1 baggage. In particular, it’s still based on a conceptual model of a PC at the center of the universe connected to dumb peripherals. The PC is the only one that can initiate anything; if an event occurs on a peripheral device (e.g. data becomes ready after a disk head travels to the right track) it still has to wait for the PC to come along and ask about it. Firewire, by contrast, is more “peer to peer” in nature. If a device has a message to send, it initiates the process of sending it. This typically results in an interrupt on the recipient, much the same way that networking cards work. This reliance on polling in USB vs. asynchronous messages and interrupts in Firewire might seem dry and uninteresting, but it does have important implications. It means that you can plug two Firewire devices (e.g. a video camera and a monitor) together and have them talk without needing a PC to mediate. You can sort of do this with USB On-The-Go but that’s just a hack to have two devices negotiate who’s on top; the fundamental protocol remains as it ever was, and Intel only added OTG as an afterthought when savvy Firewire users pointed out this shortcoming. The USB approach also has performance implications, explaining the apparent anomaly mentioned above. Polling simply consumes more resources, both “on the wire” and either on the device or on the PC (or both). I’m sure Intel thought it would be great to have the CPU do the polling, just like they think anything which wastes CPU time and forces people to buy faster processors is a great idea, but even they have been forced to implement most of the low-level polling elsewhere (typically in the south bridge). Obviously, though, some of the higher-level stuff such as device initialization still can’t be done that way, which is why Microsoft’s USB2 driver has to poll all the time … leading at last back to the problem we started with.

The key point here is that all of these problems – reduced convenience, reduced performance, reduced battery life – stem from the same fundamental mistake of designing an asymmetric protocol that relies on polling instead of symmetric asynchronous messaging to do its job. How many times will this mistake be repeated, more often nowadays in software rather than hardware, before people learn?

Catching Overflows

If you thought my previous post showed how ugly computing could be, you ain’t seen nothing yet. Recently, I had occasion to look into how to catch integer overflows in kernel code. I won’t go into a lot of detail about why, except to point out that it takes a bit less than a month for a millisecond timer stored in a signed 32-bit integer to wrap around. It didn’t take long to discover gcc’s -ftrapv option, which generates signals if an overflow is detected, but there are a couple of problems with that. One is that enabling the option causes gcc to emit a libgcc function call for every integer arithmetic operation, which is pretty seriously bad for performance. The other is that these functions don’t exist in the kernel (or in an embedded system) and neither do any of the functions to throw/catch signals. I get pretty tired of “the whole programming world is like the easy user-space stuff I do” attitudes sometimes, but that’s a rant for another time. I needed something that was reasonably efficient and could be used from the kernel. To do that, I had to resort to inline assembly, and gcc’s syntax for that has to be one of the ugliest things I’ve seen in computing. In any case, here’s what I came up with for x86 multiplication.

// Overflow-catching equivalent of "var *= num;"
#define MUL_OVF(var,num,lbl,dum) asm( \
    "imull %2\n"     /* edx:eax = eax * num */        \
    "\tjno "#lbl"\n" /* no overflow: skip the trap */ \
    "\tint3\n"       /* overflow: trip a breakpoint */\
    #lbl":"                                           \
    :"=a"(var), "=d"(dum) /* result in eax; edx clobbered into dum */ \
    :"rm"(num), "0"(var));

Yeah, it’s pretty nasty, and it’s nastier than strictly necessary so that it can be general: the lbl argument exists so that I can use the macro multiple times within one function (each expansion needs a unique label), and the dum argument lets me reuse a spare variable for the clobbered register instead of making up a new one each time. (I’d normally recommend creating a separate variable anyway for the sake of clarity, and letting the compiler’s register allocator deal with it, but when inline assembly is involved things get trickier.) Basically all this does is perform the multiplication and then trip a breakpoint if it overflowed. There’s a bit of a potential instruction-pipeline bubble because of the jump, so this doesn’t come for free, but it’s still a lot cheaper than calling a function. It would take a little more effort to write a similar macro that does something besides hit a breakpoint, like calling a function to print out a stack trace, but in a development environment the macro above (plus similar ones for addition and subtraction) should be sufficient.

Hard Disk Upgrade

For a while now, my main desktop machine has been making a bit of a squeaking noise, which I quickly identified as the hard disk. When it first started happening I bought a new disk but then it just sat there for a few months until this past weekend. Adding a disk to an existing system is no big deal, but replacing one is a bit of a pain. Besides copying files to the new disk, you have to make sure the MBR (Master Boot Record) and partition boot records are set up correctly. Since I was replacing a 40GB disk with a 120GB I wanted to resize the partitions as well, so a simple block-for-block copy wouldn’t work. This created significant potential for error. The most annoying part was running GRUB to set up the new disk for booting, because I had this crazy idea that I wanted to test booting off the new disk before I physically replaced the old disk in my tight small-form-factor case and closed it up (until then the new disk was just kind of hanging out of an open case). GRUB had tons of options and even an interactive boot-time command line, so I figured I’d be able to prepare the new disk and then do a couple of test reboots to make sure everything was working. That simply didn’t work; when I tried booting from the new disk I either got a screen full of the word “GRUB” repeating over and over, or the system just kind of sat there. Occasionally I’d get “read error” or some similarly terse error message. That provided a bit of variety but really wasn’t very helpful. To a certain extent this was just the sort of system tweaking I had expected, but what made it annoying was the poor quality of the GRUB manual. Here’s the description of the “map” command.

Command: map to_drive from_drive

Map the drive from_drive to the drive to_drive. This is necessary when you chain-load some operating systems, such as DOS, if such an OS resides at a non-first drive. Here is an example:

grub> map (hd0) (hd1)
grub> map (hd1) (hd0)

The example exchanges the order between the first hard disk and the second hard disk. See also DOS/Windows.

OK, great. Now here’s the critical question: is the identifier for either from_drive or to_drive interpreted at the time GRUB is run to set up the disk, at the time the “map” command is issued during the boot process, or at the last moment when GRUB actually tries to boot an OS? If the user guesses wrong, they might do something that either doesn’t work or – far worse – modifies the wrong disk and leaves them with an unbootable system. I went through several iterations of this, always wondering whether I was trashing my system. Eventually I just gave up on the “map” command entirely, and instead modified “device.map” (for which the documentation was even less useful) to lie about which disk was hd0.
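For anyone facing the same situation, the lie amounts to a two-line swap in that file. Assuming the old disk is /dev/hda and the new one /dev/hdb (device names here are purely illustrative; yours will differ), the edit looks something like this:

```text
# /boot/grub/device.map -- illustrative only
# Stock mapping, as grub-install guesses it:
#   (hd0)  /dev/hda     <- old disk, BIOS primary master
#   (hd1)  /dev/hdb     <- new disk
# The "lie": swap them so GRUB installs onto the new disk as hd0,
# matching what the BIOS will call it once the old disk is gone.
(hd0)  /dev/hdb
(hd1)  /dev/hda
```

Unlike the “map” command, this file is only consulted when you run GRUB’s installation tools from the running OS, so at least there’s no ambiguity about when it takes effect.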

Then I hit another snag. I’ve been dealing with SCSI and Fibre Channel for a while, and they’re generally far more complicated than IDE, but at least it’s familiar complexity. I naively thought that “hd0” would always refer to the first disk that was found, but that’s apparently not the case. It always refers to the system’s “primary master” even if it’s not present, so my new disk was “hd1” even if it was the only one in the system. This was probably behind most of my boot-testing failures. To do a real test, then, I not only had to physically disable the old disk (which I had been doing by removing its power) but I also had to move the jumper on the new drive so it would be master (which I hadn’t been doing). Once I realized that, I was quickly able to get first Linux and then Windows booting.

Do you want to know the really funny part? After all that, the noise is still there. It was probably one of the fans all along. I now have a larger and slightly faster disk, plus the old disk sitting in a USB/Firewire enclosure that I had also bought a while ago (this is a great form of backup even if it’s too expensive to do often), but my computer still squeaks.

Imported Content

As I mentioned a couple of months ago, I’ve been doing most of my political writing at It Affects You instead of here. Unfortunately, I seem to be the only one posting there and there have been some problems with comment spam and an absentee administrator. For those and other reasons, IAY actually seems to get fewer comments than here (especially if you don’t count people who read this site as well anyway). Therefore, I have imported all of my articles from there and created an automatically updating page to contain them. Enjoy.

Office Humor

I’ve heard that women don’t “get” this, but in my male-dominated corner of the industry it’s very common for camaraderie to be expressed in the form of mild verbal abuse. You’re not one of the gang unless you can take, and occasionally deliver, a good zinger. Or at least that’s what I tell myself when I’m on the receiving end of an over-the-cube-wall exchange like this:

Me: We have three Jeffs at this company now. That’s annoying.
Esteemed Colleague: One was annoying.

Blue Light Special

Maybe it’s just the time that I’m writing this, but the news that blue light helps keep people awake (via Digg) really got my attention this morning.