Cargo Cultism
Jeff Darcy October 3, 2009 21:06
Cargo cults are an extreme form of the belief that replicating someone’s behavior will lead to replicating their results. In moderation, this belief seems quite reasonable. After all, imitating the actions of those who do something better than we do is a key part of how we learn. It’s often more efficient or less painful than trial and error. The name, though, comes from technologically unsophisticated Pacific islanders observing US troops in World War 2. The islanders got the idea that by imitating the troops’ behavior they would reap the same rewards in the form of “cargo” – air-dropped supplies which were highly valuable to them. The problem, of course, is that the islanders lacked knowledge not only of the technology that was involved but also of the military context behind much of the behavior they were observing. Thus, and constrained also by their own resources, they ended up imitating only the most superficial aspects of that behavior, with often comical and occasionally tragic results.
Cargo cultism isn’t limited to “primitive” people, though. In fact, it’s very common among programmers. Many people think that if they use the same operating system or programming language or text editor as Joe Rockstar, then they will achieve rock-star results themselves. If I do this thing that I don’t really understand, my code will be as secure as Joe’s. If I repeat that mantra in my code, it will be as scalable as Joe’s.
There’s a heated discussion on the cloud-computing list at Google Groups, about “NoSQL” (more correctly “non-transactional”) data stores vs. traditional ACID-compliant RDBMSes, and the term “cargo cult” has been applied to both sides – correctly, I think. Many NoSQL advocates do indeed seem like cargo cultists. Their code will never see a scale where a free, well understood and well supported traditional database wouldn’t suffice, but they think that imitating the approaches of the highest-scale sites will bring them great success . . . somehow. There are legitimate problems with this magical belief. Unfortunately, the most strident dissent often comes from the cargo cultists of Web 1.0 and earlier, who subscribe to an equally magical belief that putting all of your data into a transactional RDBMS will solve all of your problems. It’s a battle for supremacy between two cargo cults, not a repudiation of cargo cultism itself. This pattern is actually quite common, unfortunately, usually because the participants on both sides have failed to ask two crucial questions:
- Were the people I’m imitating actually successful in some relevant way? If somebody’s claim to fame rests on being at six dot-coms (plus a blog and hyper-active participation on mailing lists) then I’d suggest not imitating them. Even if they succeeded in selling a dot-com for big money, that might only indicate business – not technical – ability. Don’t copy code that sank into oblivion after its authors became millionaires; it probably sank for a reason.
- Is the behavior I’m imitating essential to success, or merely incidental? Distinguishing the essential from the superficial requires drawing the lines from particular behaviors to particular results, which in turn requires understanding the technical context. Imitation can provide shortcuts in implementation, not in understanding; you still need to do the hard work of understanding the technology involved.
It’s easy to fall into the trap of imitating the unsuccessful all too well or imitating the successful quite poorly. Even smart people often end up waiting their whole lives for that cargo to arrive. Distinguishing success from notoriety and substance from style is often harder than mastering the specific skills needed to solve a problem. However, those who learn to ask these questions and imitate only the essential behaviors of the truly successful stand a good chance of succeeding themselves.
Nice post.
At one point folks thought if they had Aeron chairs just like the high-profile .com companies then they’d be successful too.
Is NoSQL about dropping ACID or dropping joins? I thought it was the latter but I haven’t been watching closely.
Most of the NoSQL chatter I see is about dropping both. Neither actually requires dropping SQL, though. Some parts of SQL clearly would not apply in such an environment. Others would still be quite useful, if only because of their familiarity. Still others would apply in more limited form. For example:
Note that I’ve said nothing about implementation difficulty or actual availability. Even fairly simple SELECT syntax is hard to implement across a clustered (let alone distributed) store/database, and you could get yourself into a whole heap of trouble with delete cascades that aren’t wrapped in transactions. Ew. Many of these features are so useful that it’s worth trying, though, and therefore worth keeping SQL at least as a description language. I think anything involving multiple tables is pretty much a lost cause because it’s really hard to keep those sorts of things from going exponential and that’s fatal for the kind of high-scale environments where these data stores are used. I know a hard problem when I see one, and I have all the respect in the world for the people tackling these problems, but I think full traditional-RDBMS functionality at that scale (and at reasonable cost) will remain out of reach for quite some time. Even those who believe we can get there shouldn’t ignore the question of which parts to jettison in the immediate here-and-now.
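To make the "SQL as a description language" idea a little more concrete, here is a minimal Python sketch of a single-table SELECT subset layered over a plain key-value store. Everything in it is an assumption made for illustration: `STORE`, `run_query`, the `users` table and its columns are hypothetical and do not correspond to any real data store's API. The store is an in-memory dict, the WHERE clause supports one equality test, and anything touching more than one table is rejected outright.

```python
import re

# Hypothetical in-memory stand-in for a non-transactional key-value store:
# each "table" maps a primary key to a row (a plain dict of column values).
STORE = {
    "users": {
        "u1": {"id": "u1", "name": "alice", "city": "boston"},
        "u2": {"id": "u2", "name": "bob", "city": "austin"},
        "u3": {"id": "u3", "name": "carol", "city": "boston"},
    }
}

# Accepted subset: SELECT <cols> FROM <one table> [WHERE <col> = '<value>']
# Joins, subqueries, and multi-table FROM clauses simply fail to match.
QUERY_RE = re.compile(
    r"^\s*SELECT\s+(?P<cols>\*|\w+(?:\s*,\s*\w+)*)\s+"
    r"FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<col>\w+)\s*=\s*'(?P<val>[^']*)')?\s*;?\s*$",
    re.IGNORECASE,
)


def run_query(store, sql):
    """Run a single-table SELECT against the dict-backed store."""
    match = QUERY_RE.match(sql)
    if match is None:
        raise ValueError("unsupported query (single-table SELECT only)")

    table = store.get(match.group("table"))
    if table is None:
        raise ValueError("unknown table: " + match.group("table"))

    col, val = match.group("col"), match.group("val")
    cols = match.group("cols")
    wanted = None if cols == "*" else [c.strip() for c in cols.split(",")]

    results = []
    # Full scan of one table; no index, no transaction, no cross-table work.
    for row in table.values():
        if col is not None and row.get(col) != val:
            continue
        if wanted is None:
            results.append(dict(row))
        else:
            results.append({c: row.get(c) for c in wanted})
    return results
```

Calling `run_query(STORE, "SELECT name FROM users WHERE city = 'boston'")` returns the matching rows with only the `name` column, while a query naming two tables raises `ValueError`. The sketch deliberately sidesteps the hard parts mentioned above: it runs on one node, scans everything, and says nothing about how such a query would be executed across a clustered or distributed store.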