More Sad Things

injured possum (picture from TreeHugger)

The news out of Australia has been better the last couple of days, but before that it was horrific. Look at the pictures of towering flames, of metal that melted because those flames were so hot, of towns that were in the path of the fires and are now essentially gone. Cars crashed or went off the road, or were simply overtaken, because they couldn’t escape the smoke and flames fast enough. Don’t look at picture #9 too long; it will break your heart. Now there’s suspicion that some of the fires were deliberately set, and there are recent studies from Australia itself showing that most arson is committed by firefighters.

The scary thing is the glimpse into the future that it provides. Whether you believe it’s because of climate change or careless development, the fire situation in Australia now will be the fire situation in California very soon, and elsewhere not long after that. If there’s one lesson we need to take from this, it’s that fewer people should be carving out little pieces of “empty” forest or grassland in which to live, and those who do should take great care to prepare for fire. Make sure brush is cleared away from the house, that you can deal with embers landing on your roof, that you have access to water, that you have a warning system and escape plan that work. Especially, make sure your kids know what to do, because you won’t have time to tell them during any kind of emergency.

Sad Things

Seeing a squirrel on the road, still conscious and moving but unable to use its back legs, is sad.

Seeing small children in front of a church, with a casket being carried up the steps, is even sadder.

Rumors of Bush’s Demise…

…have been greatly exaggerated.

A South African TV station erroneously broadcast that former US President George Bush had died during one of its news bulletins.

For three seconds ETV News ran a moving banner headline across the screen saying “George Bush is dead”.

The “misbroadcast” happened when a technician pressed the “broadcast live for transmission” button instead of the one for a test-run.

Bad grammar and bad interface design, all in one story. Exercise for the reader: rearrange the first sentence to remove the ambiguity regarding whether the erroneous broadcast or the death itself had occurred during the news bulletin. Followup question: who the heck put those two buttons next to each other?

Cindy also points out that, even if it wasn’t supposed to go on the air, somebody had to have typed that sentence. Didn’t Bush himself do something like that once? Why, yes. Yes, he did. It’s not quite as memorable as Reagan’s voice test, though. Sometimes private humor becomes something else when said in public.

Bashing Safely

The Bourne shell, named after Stephen Bourne, was the first UNIX shell that most people would recognize as such. It had displaced the earlier Thompson shell; it was itself eventually replaced by the Korn shell and then by bash (“Bourne Again SHell”). There was an entirely separate lineage starting with csh, but it’s largely irrelevant here. I’ve used both tcsh and zsh in my day, but nowadays I use bash because it’s what everyone else uses, and that’s important when writing scripts for a multi-person project.

The problem with bash is that there’s so much of it. Just about everyone knows about basic variable usage and pipes and redirection. Anybody who has written a medium-sized script probably adds things like basic if/for/while control structures, functions, “here documents” and command substitution with `command` or $(command). Then there are things like:

  • Different kinds of parameter expansion and default-value assignment, such as ${foo:-bar} (a few examples follow this list)
  • Different kinds of “here documents” and relatives – substituting vs. non-substituting, plus <<< (here strings) and <(command) (process substitution)
  • The “trap” command
  • Arrays
  • …etc.
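
For anyone who would rather have a refresher than a quiz, here is a small sketch of the features above. None of it comes from a real script – the names are invented for illustration – and it’s meant as a reminder, not a reference:

        #!/usr/bin/env bash
        # A quick tour of the features listed above; every name here is invented.

        # Parameter expansion with defaults (assuming foo is not already set):
        echo "${foo:-bar}"          # prints "bar" but leaves foo unset
        echo "${foo:=bar}"          # prints "bar" AND assigns it to foo
        echo "$foo"                 # now prints "bar"

        # Here documents: <<EOF substitutes variables, <<'EOF' does not.
        cat <<EOF
        substituting: foo is $foo
        EOF
        cat <<'EOF'
        non-substituting: foo is $foo
        EOF

        # Here strings and process substitution.
        read -r first rest <<< "one two three"
        echo "first=$first rest=$rest"
        while read -r line; do
            echo "got: $line"
        done < <(printf 'a\nb\n')   # loop runs in the current shell, not a pipe subshell

        # trap: run cleanup when the script exits, however it exits.
        tmpfile=$(mktemp)
        trap 'rm -f "$tmpfile"' EXIT

        # Arrays: ${#a[@]} is the element count, "${a[@]}" expands to the elements.
        a=(alpha "beta gamma" delta)
        echo "count: ${#a[@]}"      # 3
        for elem in "${a[@]}"; do
            echo "elem: $elem"      # "beta gamma" stays one element because of the quotes
        done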

I’m sure some of those few still reading could prove how “elite” they are by pointing out how they know about all of these and some other features I didn’t mention, how they’re not all that hard to master, and so on, but none of that matters. Let’s say the four “advanced techniques” above are each worth 25 points on a quiz covering every variant and subtle trap for the insufficiently attentive. I know a lot of programmers who could score 80% or better, but it’s a different 80% or better each time. I’ve seen people who would score 95% break scripts because they weren’t thinking about the difference between # and @ when they made a change, or tripped over some oddity in the order in which a half-dozen kinds of quote/parameter/other substitution are applied to each line. This is a bad thing.

        read x_gw x_if x_lcl x_rmt <<<${routes[$my_route_index]}

In case anyone's wondering, that's the fragment of bash code that inspired these musings. Give yourself a pat on the head if you understand it without referring to a manual . . . then ask yourself whether everyone in your group, or every possible maintainer, would. I don't think I've ever seen a formal coding standard for bash, but maybe there's a need for something that distinguishes a safe and almost universally intelligible subset from a larger subset that is neither. Or maybe we need a shell whose syntax is more like real languages that had things like arrays and functions and real quoting/scoping rules to start with. I don't mean Perl, obviously.
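
For what it's worth, here's roughly how that fragment might look in the "safe subset" I have in mind. This is only a sketch, and it assumes – my guess at the intent – that each routes[] entry holds space-separated gateway, interface, local, and remote fields:

        # Assumption: each routes[] entry looks like
        #   <gateway> <interface> <local-address> <remote-address>
        route_entry=${routes[$my_route_index]}

        # Split the entry into named variables; -r keeps backslashes literal,
        # and the here string (<<<) feeds the value to read on stdin.
        read -r x_gw x_if x_lcl x_rmt <<< "$route_entry"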

Sunday Pictures 2009-02-09: Christmas ’08

I’ll do video next week.

cousins – Amy with cousins (from left) Oliver, Eli, and Frances, plus Uncle Jeff, on Christmas Eve.
Amy with her Baby Alive™ doll. It eats and pees. It even comes with its own potty and diapers. It was the only gift Amy asked for specifically, and she was in love with it for the first several days.
quiet moment – Sarah and Frances during a lull.
siblings – Eli and Frances on the day after Christmas.
dress-up – Eli wearing one of Amy’s dresses, and an unusual (but not bad) kind of smile from Amy.

They Say Hearing is the First to Go

Yesterday some people were talking about “noise canceling headphones” at work. I misheard that as “voice canceling headphones” and then realized I’d really like some.

Random Observations

I’m sure glad I wasn’t on Route 2 east this morning. I came in on 2A, going west, joining 2 in Concord. Route 2 was completely blocked up all the way to 126, and still heavier than usual to the next exit beyond that. I’d guess that the real problem was on 95/128, which means the backup was at least a couple of miles.

When the temperature is in the single digits (Fahrenheit), wearing short sleeves doesn’t make people think you’re tough, or cool. Oh, wait . . . yes, it does make them think you’re cool, but not in a “that guy’s so fashionable” sense. No, it’s more of a “that complete moron must be feeling cold right now” sense.

Dams and Earthquakes

From Discover:

The devastating earthquake that killed 80,000 people in China’s Sichuan Province last May may have been triggered by a recently built hydropower dam that lies only three miles from the quake’s epicenter, some researchers are arguing. The several hundred million tons of water piled behind the Zipingpu Dam put just the wrong stresses on the adjacent Beichuan fault, [says] geophysical hazards researcher Christian Klose

That’s a very scary possibility, considering all the other dam projects that are already underway (or even completed). I guess it just goes to show how interconnected everything is. Big projects like this affect many people, often in unpredictable ways, whether the power company is owned or merely supported by the state. Either way, the people who are harmed deserve compensation from those who caused the harm. That’s true for China, and it’s true for the US, but it’s not clear whether either is prepared or inclined to ensure that the morally right things happen.

Storage Benchmarks Again

As usual, Sun is copying EMC in every way except profitability. Yet again, we have an employee making an extremely rare appearance in the blogosphere to express his own opinion, which “just happens” to match corporate market positioning. Yet again, the position is against standardized, repeatable whole-system benchmarks and for inviting big-name companies (only) to undertake the weeks’ worth of work to test performance on the customer’s actual applications. Last time it was EMC’s storage oligarchist; this time it’s Sun’s Bryan Cantrill. Bryan actually does manage to make a few good points along with the bad, so I’ll address both.

The 2008 reaffirmation of the decades-old workload is, according to SPEC, “based on recent data collected by SFS committee members from thousands of real NFS servers operating at customer sites.” SPEC leaves unspoken the uncanny coincidence that the “recent data” pointed to an identical read/write mix as that survey of those now-extinct Auspex dinosaurs a decade ago

The similarity is not inevitable, but it’s hardly an “uncanny coincidence” either. Do people really use their file servers that much differently today than they did in 1998 or even 1986? No, not really. Most organizations still have lots of users banging on lots of files in lots of different ways. The FISHworks guy focuses on read/write balance as evidence of something fishy about these numbers, but I’ll explain in a later paragraph why his analysis is misguided. Personally I find the data/metadata balance a bit more suspect. It’s my impression that file sizes have increased faster than file counts, which would imply that data operations have become relatively more important, but perhaps my impressions are skewed by having worked in HPC for the last couple of years. Certainly there are still many applications (particularly email servers) that tend to create too many tiny files instead of fewer larger ones, and metadata performance has become a hot issue in filesystems lately – as Bryan should damn well know. I’d certainly appreciate more disclosure from the SFS committee about how and where these numbers were gathered, but I’m not going to dismiss them as “uncanny coincidence” just because the results don’t favor my employer (who BTW has no horse in this race).

DRAM sizes have grown by nearly five orders of magnitude (!), and client caching has grown along with it — both in the form of traditional NFS client caching, and in higher-level caching technologies like memcached or (at a larger scale) content distribution networks. This caching serves to satisfy reads before they ever make it to the NAS head, which can leave the NAS head with those operations that cannot be cached, worked around or generally ameliorated — which is to say, writes.

Yes, yes, we’ve known that pretty much since before Bryan started using computers, if not before he was born, and such knowledge was pretty integral to the design of ZFS. Yet again we see a pundit displaying a very selective kind of ignorance, or at least trying to leave readers selectively ignorant, when suppressing or ignoring certain facts favors their argument.

the first fatal flaw of SPEC SFS: instead of making the working set a parameter of the benchmark — and having a result be not a single number but rather a graph of results given different working set sizes — the working set size is dictated by the desired number of operations per second. In particular, in running SPEC SFS 3.0, one is required to have ten megabytes of underlying filesystem for every operation per second. (Of this, 10% is utilized as the working set of the benchmark.) This “scaling rule” is a grievous error, for it diminishes the importance of cache as load climbs

Diminishing the importance of cache is not a bug; it’s a feature. Here, Bryan commits the ultimate newbie mistake of confusing performance with scalability. He wants a benchmark that measures performance on workloads that fit into cache. Customers want a benchmark that measures scalability with cache factored out. I’m dealing with a situation right now where a customer-mandated measure to factor out cache effects resulted in under-representing actual performance. Customers want real to- and from-disk numbers – which is not to say that they want to ignore the effect that controllers and interconnects have on delivering those numbers all the way to clients, but that they want numbers that can be sustained for indefinite periods at indefinite sizes. Furthermore, some of the very trends that Bryan tries to use in his favor when talking about read/write ratios work against him here. The more you use things like content delivery networks and memcached, the more random and varied and cache-busting the workload at the server will be. A “data warmth” level – working set size divided by total size – of 10% is not at all unreasonable.
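
To make the quoted scaling rule concrete (this is my arithmetic, using only figures that already appear in this post):

        130,000 ops/s × 10 MB per (op/s) = 1.3 TB of underlying filesystem
        10% of 1.3 TB ≈ 130 GB of working set

which is exactly the 130GB-within-1.3TB configuration that comes up in the dragster example below.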

in my experience . . . those who are operation intensive seem to have smaller working sets, not larger ones — and those with larger amounts of data tend to focus on bandwidth more than operations per second.

Here Bryan’s actually right on the facts but self-servingly wrong on interpretation. People who have really large amounts of data do tend to be more bandwidth-focused. However, that doesn’t mean that we can just shrink datasets arbitrarily to match the cache on the device under test. Every operation still carries some data load, and when you’re talking about scalability rather than simple performance, the total amount of data even for small-I/O workloads rapidly exceeds what any NAS device will have for cache.

Short-stroking 224 15K RPM drives is the equivalent of fueling a dragster with nitromethane — it is top performance at a price so high as to be useless off the dragstrip. It’s a safe bet that if one actually had this problem — if one wished to build a system to optimize for random reads within a 130GB working set over a total data set of 1.3TB — one would never consider such a costly solution.

Again Bryan’s right on the facts but self-servingly wrong on interpretation. Yes, short-stroking is wrong. Customers actually do engage in something a bit similar – I’ve been involved in many discussions around whether a customer should deploy N 900GB drives or 3*N 300GB drives – but they’re generally very sensitive to storage utilization levels. They would never accept the 4% storage utilization that NetApp used for the cited benchmark run, and in fact storage utilization and short-stroking have been quite contentious subjects among various vendors’ blog-representatives for a while now. Pricing disclosure (which Bryan rightly suggests) would help, as would requiring a certain storage-utilization level just as a certain working-set-size level is already required.

What’s really funny about this, though, is that what Bryan seems to want – allowing data sets small enough to fit into cache – is absolutely not a solution to this kind of gamesmanship. If short-stroking is akin to running your dragster on nitromethane, then running the whole workload out of cache is akin to running it on dilithium crystals. It’s just totally unrealistic and irrelevant to the measurements that customers actually want.

SPEC’s descent into a disk benchmark while masquerading as a system benchmark does worse than simply mislead the customer, it actively encourages the wrong engineering decisions.

Not wrong, just not good for Bryan’s baby. If Sun wants a “system benchmark” for a storage subsystem, as though that’s closer to what’s needed than the controller (not disk) benchmark that is SFS, then maybe they should propose one. Do they? Nope.

After considering SPEC SFS (and rejecting it, obviously), we came to believe that the storage benchmark well was so poisoned that the best way to demonstrate the performance of the product would be simple microbenchmarks that customers could run themselves

I guess the reader’s supposed to forget about all of those microbenchmark shortcomings Bryan mentioned earlier. Now we’re pretty much back to suggesting that customers run non-repeatable, non-comparable benchmarks themselves, with the one or two vendors that they can afford to do that for (and it is the customer doing the vendor’s work for them, BTW). Yes, we all get that Sun wants to be EMC, but as much as they’ve copied the attitude they’ll never copy the results. They’re still more like the next DEC.

Sunday Amy, 2009-02-01: Dance Class

As promised, here are some pictures and video from “observation day” at Amy’s tap/ballet class in mid-December. I don’t have much to say about each picture/video individually, but as a collection they’re adorable. I had to try really hard to arrange my Illinois/Michigan trip so I could still do this, which made my travel arrangements even more awkward than they would have been otherwise, but it was worth it.

JPG image
H.264 video, 1.5MB
H.264 video, 1.6MB
JPG image
H.264 video, 4.4MB