Yak Electrolysis

While I was reviewing a patch yesterday, I found code that used lots of distinct directory names for a series of tests – test1 would use brick1 and brick2, test2 would use brick3 and brick4, etc. I’ve run into this pattern myself, and it can be a bit of a maintenance problem as tests are added or removed. For example, in the test scripts for iwhd, there were multiple occasions when adding a test led to accidental reuse of names, and much non-hilarity ensued (everything about that project was non-hilarious but that’s a story for another time). The simplest pattern to deal with this is something like the following, which I suggested in a review comment:

# Test 1
# Test 2
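
A minimal sketch of what such an inline idiom can look like, assuming a counter named sequence (the name the text uses) and hypothetical /tmp/brick paths that are not taken from the original patch. The arithmetic expansion runs in the current shell, so every use bumps the counter and no two tests can collide.

```shell
#!/usr/bin/env bash
# Hypothetical layout: every brick directory gets the next number in sequence.
sequence=0

# Test 1
brick_a=/tmp/brick$((++sequence))
brick_b=/tmp/brick$((++sequence))

# Test 2
brick_c=/tmp/brick$((++sequence))
brick_d=/tmp/brick$((++sequence))

echo "$brick_a $brick_b $brick_c $brick_d"
```

Adding or removing a test in the middle only renumbers the directories; it never causes two tests to share one.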

This works pretty well, but the inline manipulation of $sequence kind of bugged me so I tried to put it in a function. My first try looked something like this.

function next_value {
    echo $sequence
    sequence=$((sequence+1))
}

Yeah, I hear the laughter. For those who didn’t get the joke yet, this falls prey to bash’s handling of variable scope and subshells. The $(next_value) construct ends up getting executed in a subshell, so changes it makes to variables aren’t reflected in the parent and you end up with the same value every time. I really should have stopped there, satisfying myself with the original inline version. Sure, that version can still hit the scope/subshell issue, but only if you use functions in your own code and not as a side effect of the idiom itself. I realized that getting around the scope/subshell issue would involve something ugly and inefficient, which is why I should have stopped, but I was intrigued. Surely, I thought, there should be a way to do this in an encapsulated and yet robust way. The first idea was to stick the persistent context in a temporary file.

function next_value {
    prev_value=$(cat $tmpfile)
    next_value=$((prev_value+1))
    echo $next_value > $tmpfile
    echo $next_value
}
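
To see the difference in practice, here is a self-contained sketch that puts the two variants side by side. The function names naive_next and file_next, the use of mktemp for the state file, and the starting value of 0 are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Variant 1: state in a shell variable. The increment happens inside the
# $(...) subshell, so the parent never sees it.
sequence=0
naive_next() {
    echo "$sequence"
    sequence=$((sequence + 1))
}

# Variant 2: state in a temporary file, which survives the subshell.
tmpfile=$(mktemp) || exit 1
echo 0 > "$tmpfile"
file_next() {
    local prev next
    prev=$(cat "$tmpfile")
    next=$((prev + 1))
    echo "$next" > "$tmpfile"
    echo "$next"
}

n1=$(naive_next); n2=$(naive_next)
f1=$(file_next);  f2=$(file_next)
echo "naive: $n1 $n2"    # same value twice
echo "file:  $f1 $f2"    # two different values
rm -f "$tmpfile"
```

The file-backed version pays for a cat and a write on every call, which is the kind of ugliness and inefficiency mentioned earlier.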

OK, it’s kind of icky, but it should work. Again, I should have stopped there, but that temporary file bothered me. Surely I could do that without the file I/O, perhaps by spawning a subprocess and talking to that through a pipe. Yes, folks, I had embarked on a quest to find the most insanely complicated way to solve a pretty simple problem. The result is generator.sh and here’s an example of how to use it.

source generator.sh
start_generator int_generator 5 6
dir1=/foo/bar$(next_value 5 6)
dir2=/foo/bar$(next_value 5 6)

Doesn’t look too bad, does it? OK, now go ahead and look at how it’s done. I dare you. Here are some of the funnier bits.

# start_generator
ctop=$(mktemp -t -u fifoXXXXXX)
mkfifo $ctop || fubar=1

Yes, really. Not polluting the filesystem with a temporary file was part of the point here, but I ended up dropping not one but two orts instead. (Cool word, and yes, I did use a thesaurus.) To be fair, these are only visible in the filesystem momentarily before they’re opened and then deleted, but still. I tried to find a way to do this with anonymous pipes, but there just didn’t quite seem to be a way to get bash to do that right. Here’s the next fun bit.

# start_generator
$1 < $ptoc >$ctop &
eval "exec $2> $ptoc"
eval "exec $3< $ctop"

The first line invokes the subprocess, with input and output fifos. The two execs are the bash way to create read and write file descriptors for a file. They’re wrapped in evals to satisfy my goal of making things as complicated as possible by allowing the caller to specify both the generator function/program and the file descriptors to use. Eval is very evil, of course, so let’s play Spot The Security Flaw.

start_generator int_generator "do_something_evil;"
# ...causes us to eval...
exec do_something_evil;> $ptoc

I'm not going to fix this, because it's only an "insider" threat. This code already runs with the same privilege as the caller, and can't do anything the caller can't. They could also pass in a totally bogus generator function, and I'm not going to worry about that either because they'd only be shooting themselves. On to the next fun piece.

# next_value
echo more 1>&$1
read -u $2 x

Again, this is kind of standard bash stuff to write and then read from specific file descriptors. Having an example of this is one of the main reasons I didn't just throw away the script. With a little bit of tweaking, the same technique could be used as the basis for a general form of IPC to/from a subprocess, and that might be useful some day.
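
For anyone who wants to play with the technique without reading the real script, here is a condensed sketch of the same idea. It is not generator.sh itself: the int_generator body, the use of mktemp -d to hold both fifos, and the quoting are assumptions, and only the overall shape (two fifos, a background generator, caller-chosen descriptors, a "more" request per value) follows the excerpts above.

```shell
#!/usr/bin/env bash
# Hypothetical generator: answers each request line with the next integer.
int_generator() {
    local n=0 request
    while read -r request; do
        echo "$n"
        n=$((n + 1))
    done
}

# usage: start_generator <command> <write-fd> <read-fd>
start_generator() {
    local dir ptoc ctop
    dir=$(mktemp -d) || return 1
    ptoc=$dir/ptoc    # parent-to-child fifo
    ctop=$dir/ctop    # child-to-parent fifo
    mkfifo "$ptoc" "$ctop" || return 1
    "$1" < "$ptoc" > "$ctop" &
    # Same eval trick as above, so the same caveat about trusted callers applies.
    eval "exec $2> \"\$ptoc\""
    eval "exec $3< \"\$ctop\""
    rm -rf "$dir"     # both ends are open now, so the names can go away
}

# usage: next_value <write-fd> <read-fd>
next_value() {
    local x
    echo more 1>&"$1"
    read -r -u "$2" x
    echo "$x"
}

start_generator int_generator 5 6
dir1=/foo/bar$(next_value 5 6)
dir2=/foo/bar$(next_value 5 6)
echo "$dir1 $dir2"
exec 5>&- 6<&-    # closing our ends lets the generator see EOF and exit
wait
```

Because the counter lives in the generator process, calling next_value inside $(...) no longer loses state: the subshell inherits descriptors 5 and 6 and simply talks to the same subprocess.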

To reiterate: this is some of the craziest code I've ever written. It's way more complicated than other solutions that better satisfy any likely set of requirements, and the implementation threads its way through some particularly perilous bash minefields. FFS, I might as well have just used mktemp in the first place and skipped all of this. You'd have to be nuts to solve this problem this way, but maybe my documentation of the discoveries I made along the way will help someone solve a similar problem. Or maybe it's just a funny story about bash scripting gone horribly wrong.

Scaling Filesystems vs. Other Things

David Strauss tweeted an interesting comment about using filesystems (actually he said “block devices” but I think he really meant filesystems) for scale and high availability. I thought I was following him (I definitely am now) but in fact I saw the comment when it was retweeted by Jonathan Ellis. The conversation went on a while, but quickly reached a point where it became impossible to fit even a minimally useful response under 140 characters, so I volunteered to extract the conversation into blog form.

Before I start, I’d like to point out that I know both David and Jonathan. They’re both excellent engineers and excellent people. I also don’t know the context in which David originally made his statement. On the other hand, NoSQL/BigData folks pissing all over things they’re too lazy to understand has been a bit of a hot button for me lately (e.g. see Stop the Hate). So I’m perfectly willing to believe that David’s original statement was well intentioned, perhaps a bit hasty or taken out of context, but I also know that others with far less ability and integrity than he has are likely to take such comments even further out of context and use them in their ongoing “filesystems are irrelevant” marketing campaign. So here’s the conversation so far, rearranged to show the diverging threads of discussion and with some extra commentary from me.

DavidStrauss Block devices are the wrong place scale and do HA. It’s always expensive (NetApp), unreliable (SPOF), or administratively complex (Gluster).

Obdurodon Huh? GlusterFS is *less* administratively complex than e.g. Cassandra. *Far* less. Also, block dev != filesystem.

Obdurodon It might not be the right choice for any particular case, but for reasons other than administrative complexity.
What reasons, then? Wrong semantics, wrong performance profile, redundant wrt other layers of the system, etc. I think David and I probably agree that scale and HA should be implemented in the highest layer of any particular system, not duplicated across layers or pushed down into a lower layer to make it Somebody Else’s Problem (the mistake made by every project to make the HDFS NameNode highly available). However, not all systems have the same layers. If what you need is a filesystem, then the filesystem layer might very well be the right place to deal with these issues (at least as they pertain to data rather than computation). If what you need is a column-oriented database, that might be the right place. This is where I think the original very general statement fails, though it seems likely that David was making it in a context where layering two systems had been suggested.

DavidStrauss GlusterFS is good as it gets but can still get funny under split-brain given the file system approach: http://t.co/nRu1wNqI
I was rather amused by David quoting my own answer (to a question on the Gluster community site) back at me, but also a bit mystified by the apparent change of gears. Wasn’t this about administrative complexity a moment ago? Now it’s about consistency behavior?

Obdurodon I don’t think the new behavior (in my answer) is markedly weirder than alternatives, or related to being a filesystem.

DavidStrauss It’s related to it being a filesystem because the consistency model doesn’t include a natural, guaranteed split-brain resolution.

Obdurodon Those “guarantees” have been routinely violated by most other systems too. I’m not sure why you’d single out just one.
I’ll point out here that Cassandra’s handling of Hinted Handoff has only very recently reached the standard David seems to be advocating, and was pretty “funny” (to use his term) before that. The other Dynamo-derived projects have also done well in this regard, but other “filesystem alternatives” have behavior that’s too pathetic to be funny.

DavidStrauss I’m not singling out Gluster. I think elegant split-brain recovery eludes all distributed POSIX/block device systems.
Perhaps this is true of filesystems in practice, but it’s not inherent in the filesystem model. I think it has more to do with who’s working on filesystems, who’s working on databases, who’s working on distributed systems, and how people in all of those communities relate to one another. It just so happens that the convergence of database and distributed-systems work is a bit further along, but I personally intend to apply a lot of the same distributed-system techniques in a filesystem context and I see no special impediment to doing so.

DavidStrauss #Gluster has also come a long way in admin complexity, but high-latency (geo) replication still requires manual failover.

Obdurodon Yes, IMO geosync in its current form is tres lame. That’s why I still want to do *real* wide-area replication.

DavidStrauss Top-notch geo replication requires embracing split-brain as a normal operating mode and having guaranteed, predictable recovery.

Obdurodon Agreed wrt geo-replication, but that still doesn’t support your first general statement since not all systems need that.

DavidStrauss Agreed on need for geo-replication, but geo-repl. issues are just an amplified version of issues experienced in any cluster.
As I’ve pointed out before, I disagree. Even systems that do need this feature need not – and IMO should not – try to do both local/sync and remote/async replication within a single framework. They’re different beasts, most relevantly with respect to split brain being a normal operating mode. I’ve spent my share of time pointing out to Stonebraker and other NewSQL folks that partitions really do occur even within a single data center, but they’re far from being a normal case there and that does affect how one arranges the code to handle it.

Obdurodon I’m loving this conversation, but Twitter might not be the right forum. I’ll extract into a blog post.

DavidStrauss You mean complex, theoretical distributed systems issues aren’t best handled in 140 characters or less? :-)

I think that about covers it. As I said, I disagree with the original statement in its general form, but might find myself agreeing with it in a specific context. As I see it, aggregating local filesystems to provide a single storage pool with a filesystem interface and aggregating local filesystems to provide a single storage pool with another interface (such as a column-oriented database) aren’t even different enough to say that one is definitely preferable to the other. The same fundamental issues, and many of the same techniques, apply to both. Saying that filesystems are the wrong way to address scale is like saying that a magnetic #3 Phillips screwdriver is the wrong way to turn a screw. Sometimes it is exactly the right tool, and other times the “right” tool isn’t as different from the “wrong” tool as its makers would have you believe.

Confessions of a MOO Programmer

Most people nowadays seem to learn about object-oriented programming via Python or Ruby. Before that it was C++ or Java, and before that Smalltalk or (some flavor of) LISP. My first exposure to object-oriented programming was through none of these, but instead through LambdaMOO. Best known as a persistent multi-user environment, LambdaMOO also had its own fairly unique programming language. For one thing, it was prototype-based whereas most other OOP languages tend to be class-based. This means that you never have to create a class just so you can create one instance. You just create objects, which can be used directly or (sort of) as a class, or even both. There is no distinction between virtual and non-virtual methods, nor between static and instance methods, and there are no abstract classes or singleton-pattern nonsense. You can build almost all of these things yourself, but you don’t have to conform to just one model.

However, the most interesting thing about MOOcode is its approach to permissions. Because it was designed to work in a multi-user environment, where users were often neither trusting nor trustworthy but still called each other’s code constantly, MOO needed a pretty robust permissions system. This was no lame private/protected/public model, without even the concept of an owner and thus without the ability to infer rights based on ownership. Each object, each “verb” and each property has an owner. The language includes primitives to determine the object for the previous call (caller), the operative permissions in that call (caller_perms), or even the whole call stack (callers). This lets you implement basically whatever permissions scheme you want. For example, “caller==this” is kind of like “protected” in C++, and “caller==#12345” is kind of like declaring #12345 as a “friend” likewise. At the other extreme, one could go looking further up the stack to see if the current verb is being called in some particular context even if there are multiple “unexpected” calls in between.

The most unusual thing about the MOO permission system is that every verb runs with the permissions of its author – not the user who caused it to be invoked. This is kind of like every program in UNIX being “set UID” by default, which seems crazy but actually works quite well. It makes most kinds of “trojan horse” attacks impossible, for one thing. The person who has to worry about improper access to data is also the person – the verb author/owner – who can add code to prevent it. The exact workings of MOO property ownership and inheritance were a bit strange sometimes, but most MOO programmers learned the basics and were able to secure their code pretty quickly.

Because of all these features, programming in MOOcode was a very fluid and enjoyable experience. Python comes closest among the languages I know well, though I’ve dabbled with Lua and it seems even closer. If I ever decide to spend time on inventing my own language instead of using one to get something else done, it would probably be a prototype-based OOP language with extra features for concurrency/distribution, and in that context MOO’s permission model is about as good a starting point as I’ve seen. It’s too bad that most people who could benefit from studying it are probably put off by its origins in a game-like environment.

Extracting Data From Quassel

Earlier this morning I needed to reconstruct a conversation with someone on an IRC channel, and I don’t generally keep text logs of IRC activity. However, I do use Quassel, which maintains a substantial backlog for me, and I also happen to know that the backlog is stored in a SQLite3 database. I poked around a bit and figured out enough of the schema to extract the information I wanted. I’ll probably need it again so I turned my manual hacking into a script, and other people might need to do the same thing so I’m publishing the script. Here it is: convo.py. Enjoy.

When Coding Standards Hurt Quality

Most of my work is on code that has “initialize all local variables at declaration time” as part of the coding standard. I’ve never been a big fan, but I’m very reluctant to get into coding-standard arguments (probably as the result of having had to enforce them for so long) so I just let it go. The other day, Rusty Russell offered up a better reason to avoid this particular standard. The crux of the matter is that there’s a difference between a value being initialized vs. it being initialized correctly, and the difference is too subtle to define a usable standard. Sometimes there is a reasonable default value, and you want to initialize to that value instead of setting it in ten different places. Other times every value has a distinct important meaning, and code depends on a variable having one of those instead of a bland default. Does NULL mean “unassigned” or “no such entry” or “allocate for me” or something else? The worst part of all this is that required initializers prevent compilers and static-analysis tools from finding real uninitialized-variable errors for you. As far as they’re concerned it was initialized; they don’t know that the initial value, if left alone, will cause other parts of your program to blow up. If you need a real value, what you really want to do is leave the variable uninitialized at declaration time, and let compilers etc. do what they’re good at to find any cases where it’s used without being set to a real value first. If your coding standard precludes this, your coding standard is hurting code quality.

Rusty suggests that new languages should be designed with a built-in concept of undefined variables. At the very least, each type should have a value that can not be set, and that only the interpreter/compiler can check. This last part is important, because otherwise people will use it to mean NULL, with all of the previously-mentioned ambiguity that entails. The “uninitialized” value for each type should mean only that – never “ignored” or “doesn’t matter” or anything else. A slightly better approach is to make “uninitialized” only one of many variable annotations that are possible, as in cqual. Maybe some of that functionality will even be baked into gcc or LLVM (small pieces already are), providing the same functionality in current languages. Until then, the best option is to educate people about why it can sometimes be good to leave variables uninitialized until you have a real value for them.
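
Shell scripting has a small-scale analogue of this trade-off, sketched below as an illustration rather than anything from Rusty's post. With set -u, bash treats use of an unset variable as an error, but a defensive initializer hides the bug in the same way a required initializer hides it from a C compiler. The function names and the mode variable are hypothetical.

```shell
#!/usr/bin/env bash
# With a defensive initializer, the forgotten assignment goes unnoticed.
with_initializer() (
    set -u
    local mode=""                       # initialized, but not to a real value
    [ "${1:-}" = fast ] && mode=quick   # the other path forgets to set mode
    echo "mode=[$mode]"
)

# Without it, set -u turns the same omission into an immediate error.
without_initializer() (
    set -u
    local mode                          # declared, deliberately left unset
    [ "${1:-}" = fast ] && mode=quick
    echo "mode=[$mode]"
)

with_initializer slow                   # prints an empty mode and carries on
without_initializer slow || echo "caught: mode was never set"
```

Each function body runs in a subshell (the parentheses) so that the unset-variable error ends only that call, not the whole script.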

My Android Experiment

A while back, I bought an Asus EeePad Transformer. A bit later I bought the base unit which has a keyboard and extra battery. I still love it. Being able to switch between landscape and portrait orientation on a whim is really awesome, because so many sites look so much better in one format. The battery life is phenomenal. I can get a full day of near-continuous use with the tablet alone, and the battery in the base is even bigger. It’s even smart about draining the base-unit battery to keep the built-in one as full as possible.

The last couple of days, I had to travel to NYC, so I decided to experiment with using this as my travel computer. I did bring another (small) laptop along just in case, but resolved not to use it – and I didn’t, except as a reserve battery for my MiFi portable wireless gadget. However, I have run into two serious limitations. One is that I can’t use it for presentations. Besides the fact that it physically has only a (mini) HDMI whereas most projectors are still VGA, I can’t find any decent software for presentations. I’ve looked at several apps and a couple of online services. They’re all awful. Is it too much to ask that a presentation program handle a simple two-level bullet list properly? Apparently. The other problem is that I can’t really use this thing for terminal sessions. The base-unit keyboard actually lacks an escape key. As a vi user, that’s crippling. I could use emacs instead, but the handling of the control key also seems a bit erratic. I tried using the on-screen escape key in ConnectBot, but eventually settled on using Hacker’s Keyboard instead. While I was able to get some work done (Gluster guys: that’s how I did the quorum-optional patch) it was certainly not very pleasant. I’d like to avoid solutions that require rooting the device, but I might have to resort to that.

I still love using the EeePad at home, and in meetings. I just might not be able to use it as a road machine and that makes me sad. Maybe the software situation will improve over the next year or so.

To The Cloud . . . And Beyond!

You might not have noticed, but I just moved. As part of my ongoing project to consolidate my various web “properties” I upgraded and updated my Rackspace cloud server (which I’ve been using for two years), and put nginx + php_fpm + mysql on it to serve my websites. It probably wasn’t the best idea to do the move on the same day I posted something as inflammatory as my last post – there was some virtual-memory tuning I’d forgotten, and I did get bitten by the uber-stupid “OOM Killer” under the Hacker News load – but it all seems to be working out otherwise. One of the nice things is that I can resize my server any time I expect a similar spike, then shrink it again when the spike’s over. If I were really motivated I’d do it all automatically, but I don’t have that kind of spare time.

So, as usual, please let me know if you see any glitches. One of the things the traffic spike did for me was show that normal stuff is working, but some stuff around the edges might still need tweaking. I know FTP access and image links to womb.atyp.us (in old posts) aren’t working. Anything else?

Why Buy RHEL?

Yet again, I’m going to post about something related to my employer. Yet again, I’m going to reiterate that this is not an official Red Hat position. In fact, I more than half expect I’ll get in trouble for saying it, but it just had to be said. You see, there’s a discussion on Slashdot about How Can I Justify Using Red Hat When CentOS Exists? The poster wants the functionality of Red Hat Enterprise Linux, but the CIO doesn’t want to pay for it and demands that they use CentOS instead. A lot of people have tried to explain the various aspects of what a RHEL subscription gets you. I’m not going to expand or correct those comments, partly because that will definitely get me in trouble and partly because I just don’t care. Here’s the reason that apparently carries no weight at all with CIOs and never even occurs to Slashdotters.

Because it’s the fucking right thing to do, you assholes.

Yeah, I used profanity on what has almost always been a family-friendly blog. I did that because it’s so utterly infuriating that such an obvious and important principle has totally escaped notice elsewhere. If you value something, you pay for it. Even the worst free-market zealots claim to believe that. They often use the same rationale to justify eliminating regulations (especially environmental ones) or replacing public aid with private charity. Red Hat folks do more work than anyone to improve the Linux kernel, GNOME, and dozens of other projects. They write the code, do the testing, fix the bugs, write the documentation, and provide all kinds of logistical support. The beneficiaries include not just obvious derivatives like CentOS and Scientific but even commercial competitors from Oracle and Amazon’s obvious clones to completely separate distributions like Ubuntu which also package that code and fixes. This work isn’t done by volunteers. It costs a lot of money. The fact that we allow the code to be distributed for free should have nothing to do with the principle that you pay for what you value. When you violate that principle you ensure that there will be less of what you value. The result will be a net loss for everyone, as less innovation occurs and more energy is wasted making sure everyone’s “intellectual property” remains under lock and key. Even the thieves lose.

I’d really like to hear from someone who can offer a better moral justification than “we can so we should” for using CentOS on thousands of machines without paying for even one RHEL subscription, because nothing I’ve heard so far is even close. “Duty to maximize profits” arguments will be deleted, because I’ve already turned that one into swiss cheese enough times in my life. Does anybody seriously believe that freeloading should be on the “good” side of our collective moral map?

Libertarian Watch

Alex Tabarrok has written what might very well be the stupidest thing I’ll read this year, about the Mexican Mafia. In it, he portrays their extortion as “taxes” because folks like him love to do the opposite and portray taxes as extortion. He takes it a little further than most, though, by claiming that the MM “became a kind of government” because some of their actions could be construed as protecting property rights or adjudicating disputes. Is that enough to make a government? Is it really equivalent to the torts and courts on which even the most free societies and markets depend? Does the MM provide anything equivalent to national defense – the one institution even the most radical government-haters seem to favor? No, they rely on prison guards, and beyond them the real military, for that. In fact, their whole enterprise depends on Real Government doing all the hard work of delivering victims by incarceration. Tabarrok concludes that the Mexican Mafia has “much to teach us about crime and governance” despite all this. I disagree. An unelected and unaccountable authority defined by ethnic homogeneity and engaging in “taxation” without representation would have no legitimacy as a government, and bears no resemblance to the one with which Alex is not so subtly comparing it. Even a meth habit doesn’t explain that kind of writing.

In other, slightly better, news, Radley Balko has finally figured out that the limited-liability corporation is really an exercise in political economy, and might not be truly compatible with libertarian ideals. Yeah, the “limited liability” part, unaccompanied by anything in return for that governmental favor, kind of gave that away. The corporate structure is to liability what an address in the Caymans is to taxation. Many people have recognized that for years. They’ve suggested that, if we’re going to break the relationship between profit and risk (which real free-market theory tells us is essential), we should at least try to limit or recover the losses that result. Do you suppose that whole careers spent attacking such people as socialist might explain why normal people see “libertarian” as nothing to do with free markets? Of course, the comments to Radley’s article make it quite clear that even asking an innocent question is viewed as heresy. Ours is not to question. Ours is only to accept our position below the New Aristocracy in Washington and Wall Street.

Ganz – Doing Security Wrong

Last week, my mother sent my daughter a gift – a “Mazin Hamster” from Ganz. It comes with a “feature code” that supposedly confers access to a special area of the Webkinz online world. No link; you’ll see why soon enough. The problem is that the hamster’s feature code by itself doesn’t give you access to the Webkinz site. For that, you need the “secret code” associated with a regular Webkinz animal first; then you can use the hamster’s code to get into the special area. Not having such a secret code, I set about procuring one. I went to eBay, found an auction for a cute little gecko with a sealed code attached, and quickly won the auction for far less than it would cost to buy a similar animal in a brick-and-mortar store. So far, so good.

When the gecko arrived, we tried to use its secret code to register on the website. I’m sure everyone can guess what happened next; we were informed that the code had already been used and thus was no longer valid. So here I am, in clear physical possession of both the toy itself and the associated card/ticket with a unique code printed on it, having provably paid for both, but as far as Ganz is concerned I do not own that code. Sometimes possession isn’t nine tenths of the law, after all. The first thing I did was contact the seller, who I will not name because I’m not really sure he did anything wrong. I was polite. I explained the situation, warned him that some of the “sealed” codes on toys he’s selling might not have been sealed in any useful sense after all, and sought his advice. As expected, he swore that the code had been sealed when he got the toy and when he sent it to me. He offered to send me a new code if he got one, but I have to say if I did get a code I could never shake the suspicion that it had come from some other kid’s toy. Having been disappointed twice, Amy was in tears by this point. I don’t much like the idea of merely causing yet another little boy or girl to cry, and I told the seller that.

My next step was to contact Ganz. The phone representative confirmed that the code had already been used, adding that it had been as far back as 2008 and even giving me the first name of who they considered the owner. The toy does appear brand new, in case you were wondering. I’ve seen plenty of these toys before. We even have one (sans code) already, and I can assure you that they don’t stay new-looking long after they get into the hands of a kid who would be interested in registering on the site. Phone Gal also informed me that they do not support sales via Amazon or eBay, only from physical stores or their own eStore. First I’d heard about that. I verified that physical possession of the object didn’t count, and then bade Phone Gal good day.

OK, so I got screwed, but that’s not what this is about. What’s the real problem here? The eBay seller had tried to tell me that the codes could be guessed, but I’m skeptical. Each code has to be associated with a particular type of animal. There are enough digits in the code, and enough hundreds of animal types, that making five guesses per day on the Webkinz sites isn’t really going to be very rewarding. No, the first real problem is that the physical security on the authentic codes is very weak. It’s just a simple slip of paper in a plastic envelope tied shut with a little blue ribbon. There’s no plastic thing that you have to break to get at the code, no scratch area, not even a tamper-evident foil seal on the envelope. It would be trivial to buy the toy, use the code, put the code slip back in the envelope, and re-sell it. The physical security is so poor that it would even be possible to do all of this in the store without purchasing anything, and I suspect that’s where most illicitly used codes come from. I was briefly tempted to do exactly that myself, and I’m pretty sure that’s what the eBay seller intended to do, but I try to be a better person than that.

That’s not really the biggest problem here, though. The biggest problem is Ganz’s attitude. They must be aware of how easy it is to steal or misuse codes, and of how often it actually happens. They could secure the codes better, but that might add a couple of pennies to the price. Sadly, I know enough about our collective “race to the bottom” to understand and almost accept that they couldn’t be expected to do that. Alternatively, they could accept proof of physical possession as proof of virtual possession. That would cost them nothing, and would be the fair thing to do according to every moral standard I can think of. Why don’t they? I think it’s because they don’t want to support any kind of re-sale at all. They want to sell you a brand new toy, at full price, even if the toy you already have is only “not new” by virtue of illicit use that they have practically encouraged. Their position is even worse than the RIAA or MPAA, who have at least had to concede that physical transfer of a CD or DVD transfers rights as well. A stolen code is not a lost sale to them; it’s two sales. Doing the right thing would hurt their business. The status quo suits them just fine, and they don’t care how many children’s tears are shed because of it.

No, Ganz, I will not be buying anything from you. Ever. I will endure Amy’s tears if I have to. I will use this as an opportunity to teach her about how companies sometimes do things that are wrong, about the concept of socially responsible purchase decisions, and about boycotts. Then I’ll substitute some other equivalent gift, perhaps a game or membership on some other site, because it’s not her fault (or my mother’s) that you’re evil. I’m so annoyed that I might even do more than that. You’ve made an enemy.