Best Comment about the Nobel Peace Prize

By Doghouse Riley, in a comment on a post by Roy Edroso.

take it up with the … Committee. I don’t remember them clearing Milton Friedman’s award with me first.

I also think that the award was more about Bush than about Obama. I have said publicly that I believe it was petty, and insulting to other candidates, and I stand by those beliefs. Nonetheless, I believe the people on the right who are going all apoplectic about this, using it as another excuse to drag out the tired old “worst president ever” aping of a true observation about their own Chosen One from 2000-2008, are just being boors. They would have said the same thing if the award had been deserved, and we all know that.

Keyspace and Riak

The space of highly scalable key/value stores is pretty crowded. The roundups at mindstorms and metabrew should give a pretty good feel for what’s out there. Two projects I recently learned about that aren’t covered by either are Keyspace at the more Dynamo-like end of the spectrum, and Riak at the more S3-like end. Both look quite interesting, and I recommend some of the Riak pages as an introduction to things like consistent hashing and Brewer’s CAP Theorem as well. I hope I get a chance to play with them both, and if I do I’ll probably drop a note or two here.

Chronic Fatigue Syndrome Discovery

Since I know multiple people who have, or think they have, CFS, here’s some interesting news: it might be linked to a retrovirus.

Researchers found that two-thirds of people with chronic fatigue are infected with a retrovirus called XMRV, according to a new study in the journal Science Express. XMRV has also been found in the tumors of some prostate cancer patients.

The new study compared blood samples from 101 chronic fatigue patients with samples from 218 healthy people. About 67 percent of the sick people had XMRV, compared with fewer than 4 percent of healthy people.

Not conclusive, and certainly not a cure, but interesting.

Fun With Databases

It’s kind of funny that I’ve been tangling with SQL cultists so much lately, because I’ve also been using SQL more than usual. I’m prototyping some code, and I decided to use SQLite for a lot of the data. Why? Partly it’s because it’s a prototype and I’m too lazy to implement several kinds of specialized data structures just yet, but it goes a bit deeper than that. I think it’s hard to design the right data structures without having a detailed description of the operations you will be performing on them, and it just so happens that the list of canned queries that will be executed against the SQLite version is an excellent description of those operations. It’s also a description that’s still changing hour to hour, so designing and redesigning my data structures would be much more of a drag than rephrasing a few queries. It’s something to think about.

Anyway, it should surprise nobody to know that some of my tables represent directory trees. That’s the kind of stuff I do, after all. One problem I’ve run into is that some operations need to be applied to an entire arbitrarily deep tree, and that’s hard to model in database-speak. SQLite parses but doesn’t enforce foreign-key constraints, so that obvious solution is out. Two other solutions present themselves. The first, which I didn’t pursue, would be to treat the position of a file within the tree as a one-to-many relationship between that file and all of its parents all the way up to the filesystem root, with an auxiliary table being the standard way to represent one-to-many relationships. Thus, we end up with something like this, using deletion as an example.

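-- One row per inode, plus one aux row per (descendant, ancestor) pair.
CREATE TABLE main (inode INTEGER, stuff BLOB);
CREATE TABLE aux (child INTEGER, parent INTEGER);
...
-- Delete everything under directory xxx (xxx is a placeholder inode number).
DELETE FROM main WHERE inode IN (SELECT child FROM aux WHERE parent == xxx);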

I might not have the syntax exactly right. I don’t care. The real killer is that keeping aux up to date is likely to make object creation much more painful, and likewise for any other operation besides delete (e.g. rename). Instead, I went with another solution based on triggers. The obvious version would be a delete trigger that deletes the children directly, and so fires itself again at each level of the tree, but SQLite triggers are not recursive. In other words, a trigger X cannot cause itself to fire again – even indirectly. I tested having two identical triggers which would fire each other, but the fun ended after each had fired once. Foiled again. Instead, I ended up with something like this.

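CREATE TABLE main (inode INTEGER, parent INTEGER, stuff BLOB, stale INTEGER);
-- Deleting a row marks its direct children stale, so the next pass picks them up.
CREATE TRIGGER d_trig AFTER DELETE ON main BEGIN
    UPDATE main SET stale = 1 WHERE parent == old.inode;
END;
...
-- Mark the root of the doomed subtree (xxx is a placeholder), then start deleting.
UPDATE main SET stale = 1 WHERE inode == xxx;
DELETE FROM main WHERE stale = 1;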

All I need to do then is continue re-executing that last DELETE once per directory level, until it affects zero rows (sqlite3_changes() is the key there). Works like a charm. It’s a bit ugly having the extra stale column, but it doesn’t impose any extra cost on inserts and the approach is pretty readily extensible to operations other than delete (left as an exercise for the reader). Maybe this is a standard trick, though a quick search showed many people asking about the same problem and none proposing this solution (or anything better). Now that I’ve written about it, maybe it will show up in the next guy’s search.
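For anyone who wants to try the loop without writing C, here is a minimal sketch in Python, assuming the standard sqlite3 module. It uses cursor.rowcount as a stand-in for sqlite3_changes(). The names make_demo_db and delete_tree, the DEFAULT 0 on the stale column, and the sample tree are all made up for illustration; this is not the code from my prototype.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory tree (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE main (inode INTEGER, parent INTEGER, stuff BLOB,
                           stale INTEGER DEFAULT 0);
        CREATE TRIGGER d_trig AFTER DELETE ON main BEGIN
            UPDATE main SET stale = 1 WHERE parent = old.inode;
        END;
    """)
    # (inode, parent): 1 is the root; 2 holds 3 and 4; 3 holds 5; 6 sits next to 2.
    rows = [(1, None), (2, 1), (3, 2), (4, 2), (5, 3), (6, 1)]
    conn.executemany("INSERT INTO main (inode, parent) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def delete_tree(conn, root_inode):
    """Delete root_inode and its whole subtree; return the number of rows deleted."""
    cur = conn.cursor()
    cur.execute("UPDATE main SET stale = 1 WHERE inode = ?", (root_inode,))
    deleted = 0
    while True:
        # Each pass removes the rows marked so far; the trigger marks their children.
        cur.execute("DELETE FROM main WHERE stale = 1")
        if cur.rowcount == 0:  # nothing left to delete, so the subtree is gone
            break
        deleted += cur.rowcount
    conn.commit()
    return deleted


if __name__ == "__main__":
    conn = make_demo_db()
    print(delete_tree(conn, 2))
    print([row[0] for row in conn.execute("SELECT inode FROM main ORDER BY inode")])
```

On the sample tree, deleting inode 2 should take out inodes 2 through 5 and leave 1 and 6 alone. The loop runs one extra DELETE at the end just to learn that nothing is left, which seems a fair price for not tracking tree depth anywhere.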

P.S. Yes, I realize these shenanigans wouldn’t have been necessary if I had rolled my own data structures instead of using SQLite. It was a pretty minor speed-bump, though, and the effort to get past it was still less – far less – than the alternative would have required overall.

Insider Trading

If I mention a public company on this blog, and somebody else posts a link to me on Yahoo! Finance, am I constrained with respect to buying or selling that company’s stock? As absurd as it might seem, I suspect the practical answer is yes. Certainly the person posting to Y!F would be subject to such constraints; there have been many cases regarding that “pump and dump” tactic over the years, and rightly so. I suppose that two unscrupulous people could try to avoid such scrutiny by having the blogger do the trading so that the Y!F poster’s hands appear clean. I’m not even sure it would work, and it would require a far more influential blog than this one to have any effect on price, but it’s certainly not hard to imagine someone trying.

I wasn’t planning to buy or sell anything I’ve mentioned here, but it does irk me a bit that somebody I don’t even know could create that kind of trouble for me. If I ever do start picking individual stocks, I guess I’ll also have to start paying attention to inbound links.

Cargo Cultism

Cargo cults are an extreme form of the belief that replicating someone’s behavior will lead to replicating their results. In moderation, this belief seems quite reasonable. After all, imitating the actions of those who do something better than we do is a key part of how we learn. It’s often more efficient or less painful than trial and error. The name, though, comes from technologically unsophisticated Pacific islanders observing US troops in World War 2. The islanders got the idea that by imitating the troops’ behavior they would reap the same rewards in the form of “cargo” – air-dropped supplies which were highly valuable to them. The problem, of course, is that the islanders lacked knowledge not only of the technology that was involved but also of the military context behind much of the behavior they were observing. Thus, and constrained also by their own resources, they ended up imitating only the most superficial aspects of that behavior, with often comical and occasionally tragic results.

Cargo cultism isn’t limited to “primitive” people, though. In fact, it’s very common among programmers. Many people think that if they use the same operating system or programming language or text editor as Joe Rockstar, then they will achieve rock-star results themselves. If I do this thing that I don’t really understand, my code will be as secure as Joe’s. If I repeat that mantra in my code, it will be as scalable as Joe’s. There’s a heated discussion on the cloud-computing list at Google Groups, about “NoSQL” (more correctly “non-transactional”) data stores vs. traditional ACID-compliant RDBMSes, and the term “cargo cult” has been applied to both sides – correctly, I think. Many NoSQL advocates do indeed seem like cargo cultists. Their code will never see a scale where a free, well understood and well supported traditional database wouldn’t suffice, but they think that imitating the approaches of the highest-scale sites will bring them great success . . . somehow. There are legitimate problems with this magical belief. Unfortunately, the most strident dissent often comes from the cargo cultists of Web 1.0 and earlier, who subscribe to an equally magical belief that putting all of your data into a transactional RDBMS will solve all of your problems. It’s a battle for supremacy between two cargo cults, not a repudiation of cargo cultism itself. This pattern is actually quite common, unfortunately, usually because the participants on both sides have failed to ask two crucial questions:

  • Were the people I’m imitating actually successful in some relevant way? If somebody’s claim to fame rests on being at six dot-coms (plus a blog and hyper-active participation on mailing lists) then I’d suggest not imitating them. Even if they succeeded in selling a dot-com for big money, that might only indicate business – not technical – ability. Don’t copy code that sank into oblivion after its authors became millionaires; it probably sank for a reason.
  • Is the behavior I’m imitating essential to success, or merely incidental? Distinguishing the essential from the superficial requires drawing the lines from particular behaviors to particular results, which in turn requires understanding the technical context. Imitation can provide shortcuts in implementation, not in understanding; you still need to do the hard work of understanding the technology involved.

It’s easy to fall into the trap of imitating the unsuccessful all too well or imitating the successful quite poorly. Even smart people often end up waiting their whole lives for that cargo to arrive. Distinguishing success from notoriety and substance from style is often harder than mastering the specific skills needed to solve a problem. However, those who learn to ask these questions and imitate only the essential behaviors of the truly successful stand a good chance of succeeding themselves.

My Little Programmer

Last night at dinner, Amy happily announced,

I’m in zeroth grade.

I’m so happy that we got her started early on counting from zero instead of one. :) In a similar vein, a couple of weeks ago we visited her classroom for “Back to School Day” and one of the projects the kids had done was a drawing with an “I can” caption. I can run, I can swim, I can ride a bike. Amy’s said,

I can build a circuit.

Actually the spelling was a bit off (they’re not even trying to work on that yet), but you get the idea. She got a Snap Circuits set for her birthday – or was it Christmas? – and has had a lot of fun with it. I’m sure it won’t be long before her hardware abilities exceed my own.

P.S. Her drawing is really improving lately, too. I’ll have to remember to post some pictures of her art soon.