Michael Stonebraker has lobbed a bomb into the NOSQL camp. He makes a perfectly good point that NOSQL has less to do with SQL than with ACID – a point I’ve tried to make many times myself, and which is reflected in the recent adoption of “Not Only SQL” as a preferred anti-acronym. Then he goes off the rails a bit by talking about scalability while showing no awareness of the CAP Theorem.
The CAP Theorem is a classic triangle; you can have any two of Consistency, Availability, and Partition tolerance (partitionability, for short). Availability in this sense doesn’t mean eventual availability any more than consistency means eventual consistency; it means the timely availability of data when you ask for it. Similarly, partitionability doesn’t just mean that there are multiple nodes (that’s just scalability) but that some of them might be unreachable.
I’m sure Stonebraker knows all this, but for some reason he seems to ignore it as he argues that you don’t need to sacrifice consistency for availability. He’s right ... if you’re willing to sacrifice partitionability instead. He almost makes this connection when he points out that auto-sharding databases do exist and that many NOSQL stores do have distributed operation (i.e. partitionability) as a focus, but he seemingly fails to appreciate the significance of that juxtaposition within a CAP conceptual framework. Yes, an auto-sharding database built with consideration for all four of his performance-limiting features can maintain C and A, but that was an obvious result already and it’s not relevant to the large number of NOSQL systems which are about A and P. (Some systems do concentrate on C and P, but they seem much less common or fashionable right now.)
His supposed example is flawed not only by that, but also by the fact that of course an in-memory database will get great pseudo-TPC numbers. Unfortunately, real TPC results require durability, and only a minority of serious real-world datasets will economically succumb to memory-based approaches unless and until CPU/memory/bandwidth balances change more than they have or are likely to. “Hooray” for those who can benefit, “who cares” for everyone else.
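
To make the CAP tradeoff described above a bit more concrete, here is a minimal toy sketch in Python. Everything in it is hypothetical and invented for illustration (the `Cluster` and `Replica` classes, the `"CP"`/`"AP"` mode labels, the two-node setup); it is not how any real NOSQL store or Stonebraker's system works. It only shows the choice a node faces once its peer is unreachable: refuse the write and give up availability, or accept it locally and give up consistency.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised when a replica refuses a request to protect consistency."""


@dataclass
class Replica:
    name: str
    data: dict = field(default_factory=dict)


@dataclass
class Cluster:
    # "CP" or "AP": hypothetical labels for this sketch only.
    mode: str
    partitioned: bool = False
    a: Replica = field(default_factory=lambda: Replica("a"))
    b: Replica = field(default_factory=lambda: Replica("b"))

    def _pick(self, via):
        # Return (local, remote) for the replica the client talks to.
        return (self.a, self.b) if via == "a" else (self.b, self.a)

    def write(self, via, key, value):
        local, remote = self._pick(via)
        if self.partitioned:
            if self.mode == "CP":
                # Keep C, give up A: refuse rather than diverge.
                raise Unavailable(f"{local.name} cannot reach {remote.name}")
            # Keep A, give up C: accept locally; replicas now disagree.
            local.data[key] = value
            return
        # No partition: both replicas are updated together.
        local.data[key] = value
        remote.data[key] = value

    def read(self, via, key):
        # Reads are always served from the local copy in this sketch.
        local, _ = self._pick(via)
        return local.data.get(key)
```

With no partition, both modes behave identically, which is the "C and A" case. Flip `partitioned` to `True` and the two modes part ways: an `"AP"` cluster keeps taking writes and lets the two replicas return different values for the same key, while a `"CP"` cluster raises `Unavailable` on the write and both replicas keep agreeing on the old value.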
In the end, I think Stonebraker’s attempted criticism of NOSQL actually validates it. The flaws in his own “have your cake and eat it too” analysis show that CAP is very real, and that NOSQL folks are making a very reasonable choice when they choose to abandon C. It’s not the only reasonable choice, not even for the environments where those solutions are often developed (see Yahoo’s PNUTS for yet another perspective on the tradeoffs involved), but the people making it do know what they’re doing. Why can’t we all just get along?