Every once in a while, somebody comes up with the “new” idea that eventually consistent systems (or AP in CAP terminology) are useless. Of course, it’s not really new at all; the SQL RDBMS neanderthals have been making this claim-without-proof ever since NoSQL databases brought other models back into the spotlight. In the usual formulation, banks must have immediate consistency and would never rely on resolving conflicts after the fact… except that they do and have for centuries.
Most recently but least notably, this same line of non-reasoning has been regurgitated by Emin Gün Sirer in The NoSQL Partition Tolerance Myth and You Might Be A Data Radical. I’m not sure you can be a radical by repeating a decades-old meme, but in amongst the anti-NoSQL trolling there’s just enough of a nugget of truth for me to use as a launchpad for some related thoughts.
The first thought has to do with the idea of “partition oblivious” systems. EGS defines “partition tolerance” as “a system’s overall ability to live up to its specification in the presence of network partitions” but then assumes one strongly consistent specification for the remainder. That’s assuming the conclusion: if you take strong consistency as an absolute requirement, then of course you conclude that weakly consistent systems are all failures. However, what he euphemistically calls “graceful degradation” (really: refusing writes in the presence of a true partition) is anything but graceful to many people. In a comment on Alex Popescu’s thread about this, I used the example of sensor networks, but there are other examples as well. Sometimes consistency is preferable and sometimes availability is. That’s the whole essence of what Brewer was getting at all those years ago.
Truly partition-oblivious systems do exist, as a subset of the systems EGS lumps under that label. I think it’s a reasonable description of any system that not only allows inconsistency but has only a weak method of resolving conflicts. “Last writer wins” and “latest timestamp” both fall into this category. However, even those have been useful to many people over the years. From early distributed filesystems to very current file-synchronization services like Dropbox, “last writer wins” has proven quite adequate for many people’s needs. Beyond that there is a whole family of systems that are not so much oblivious to partitions as responsive to them in different ways. Any system that uses vector clocks or version vectors, for example, is far from oblivious. The partition was very much recognized, and very conscious decisions were made to deal with it. In some systems – Coda, Lotus Notes, Couchbase – this even includes user-specified conflict resolution that can accommodate practically any non-immediate consistency need. Most truly partition-oblivious systems – the ones that don’t even attempt conflict resolution but instead just return possibly inconsistent data from whichever copy is closest – never get beyond a single developer’s sandbox, so they’re a bit of a strawman.
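To make the distinction concrete, here’s a minimal sketch (in Python, with hypothetical function names) of the two approaches. Last-writer-wins picks a winner by timestamp and silently discards the other write; a version-vector comparison can tell that two replicas diverged during a partition, so a conflict can actually be surfaced and resolved.

```python
def lww_merge(a, b):
    """Last-writer-wins: keep whichever write carries the later timestamp.
    The losing write is silently discarded -- the system never even
    notices there was a conflict."""
    return a if a["ts"] >= b["ts"] else b

def vv_compare(va, vb):
    """Compare two version vectors (dicts mapping replica id -> counter).
    Returns 'equal', 'descends' (va saw everything vb did and more),
    'precedes' (the reverse), or 'concurrent' (a true conflict: neither
    history contains the other)."""
    keys = set(va) | set(vb)
    a_ge = all(va.get(k, 0) >= vb.get(k, 0) for k in keys)
    b_ge = all(vb.get(k, 0) >= va.get(k, 0) for k in keys)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "descends"
    if b_ge:
        return "precedes"
    return "concurrent"

# Two replicas, r1 and r2, each accept an independent write during a partition:
x = {"val": "A", "ts": 100, "vv": {"r1": 2, "r2": 1}}
y = {"val": "B", "ts": 101, "vv": {"r1": 1, "r2": 2}}

print(lww_merge(x, y)["val"])        # 'B' -- x's write is lost without a trace
print(vv_compare(x["vv"], y["vv"]))  # 'concurrent' -- the conflict is detected
```

The version-vector result is the hook on which systems like Coda or Couchbase hang their user-specified conflict resolution: once a merge is flagged as concurrent, the application (not a timestamp tiebreaker) decides what the merged value should be.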
Speaking of developers’ sandboxes, I think distributed version control is an excellent example of where eventual consistency does indeed provide great value to users. From RCS and SCCS through CVS and Subversion, version control was a very transactional, synchronous process – lock something by checking it out, work on it, release the lock by checking in. Like every developer of that era, I dealt with failed transactions by manually breaking those locks many times. As teams scaled up in both number of developers and distribution across timezones and schedules, this “can’t make changes unless you can ensure consistency” model broke down badly. Along came a whole generation of distributed systems – git, hg, bzr, and many others – to address the need. These systems are, at their core, eventually consistent databases. They allow developers to make changes independently, and have robust (though admittedly domain-specific) conflict resolution mechanisms. In fact, they solve the divergence problem so well that they treat partitions as a normal case rather than an exception. Clearly, EGS’s characterization of such behavior as “lobotomized” (technically incorrect even in a medical sense, BTW, since the operation he’s clearly referring to is actually a corpus callosotomy) is off base, since a lot of people at least as smart as he is derive significant value from it.
That example probably only resonates with programmers, though. Let’s find some others. How about the process of scientific knowledge exchange via journals and conferences? Researchers generate new data and results independently, then “commit” them to a common store. There’s even a conflict-resolution procedure, domain-specific just like the DVCS example but nonetheless demonstrably useful. This is definitely better than requiring that all people working on the same problem or dataset remain in constant communication or “degrade gracefully” by stopping work. That has never worked, and could never work, to facilitate scientific progress. An even more prosaic example might be the way police share information about a fleeing suspect’s location, or military units share similar information about targets and threats. Would you rather have possibly inconsistent/outdated information, or no information at all? Once you start thinking about how the real world works, eventual consistency pops up everywhere. It’s not some inferior cousin of strong consistency, some easy way out chosen only by lazy developers. It’s the way many important things work, and must work if they’re to work at all. It’s really strong/immediate consistency that’s an anomaly, existing only in a world where problems can be constrained to fit simplistic solutions. The lazy developers just throw locks around things, over-serialize, over-synchronize, and throw their hands in the air when there’s a partition.
Is non-eventual consistency useful? That might well be the more interesting question.