I wrote about availability and partition tolerance in my last post, so I guess I should tackle consistency as well. Informally, “consistency” can refer to two slightly different things: conforming to some set of rules, or getting similar results for similar actions. Both parts apply to consistency in distributed systems: consistency means conforming to rules about who sees what values when. In its strongest form, consistency means a complete global ordering of reads and writes in a distributed system. Let’s say we’re dealing with an integer value that starts at 6. Two updates occur: X which adds 4, and Y which divides by 2. Furthermore, we know that there’s a causal relationship determining the order of the two updates because some other kind of event – e.g. a lock operation or message exchange – came in between. A complete global ordering ensures two things.
- No node will ever see the result of Y before X. In other words, they might see 6/10/5 depending on when they read, but they will never see 3.
- Once any node has seen the result of an update, no node will ever see a state without it. Once Z sees 10, nobody can see 6 again; once Z sees 5, nobody can see 10 again.
Eventual consistency breaks both of these rules. A node might in fact see Y before X, temporarily yielding 3. A moment later, another node might see X first and get 10. What remains true, though, is that once any node sees 10 then no other node should see 7. That’s because of the guarantee that eventual consistency does make.
For any pair of updates X and Y, all nodes will eventually converge on a result that reflects X and Y applied in the same order.
In other words, it’s OK if different nodes see different values or see them at different times, so long as eventually – meaning when both X and Y have worked their way through the system – they agree on the same value. If a system allows conflicting values to persist indefinitely, it’s not eventually consistent; it’s simply inconsistent. Note that there’s no limit on how long it takes for the system to quiesce. There’s also no restriction on what order everyone agrees on. The eventual agreed-upon result could be 7, representing Y before X and thus violating the causal relationship, but very few actual systems would make that choice. In fact, it’s entirely possibly to add additional guarantees on top of eventual consistency. Some of the most common are:
- Read your own writes. The node that issued X should never thereafter see a result (e.g. 3) without it, even though other nodes might. Some people do actually manage to screw this one up, but it’s rare.
- Processor consistency. If the same node issues X and Y, then nobody should ever see Y without X.
- Never go back. A single node in our example will always proceed 6-10-5 even if it does so out of sync with other nodes.
Preservation of causal relationships is one of the most complex guarantees that is typically provided in eventually-consistent systems. Using mechanisms such as vector clocks, it is possible for a node receiving Y to observe that there might be an as-yet-unreceived X that should go before. It can then defer processing of Y until X arrives, or until it’s no longer possible for any such X to exist. Such careful ordering can be defeated by out-of-band messages that overtake the in-band messages, though, since by definition the information necessary to (re)establish ordering is not present out-of-band. Application designers who depend on such guarantees must therefore exercise some caution (or remove the dependency).
The last point I’d like to make about eventual consistency is one that I recently made via email. Eventual consistency just means updates are asynchronous. It doesn’t mean that they’re slow, or unordered, or non-atomic. In practice, extra guarantees such as the above can be added so that all of the consistency behaviors that matter to a particular application are preserved just as well as with strong consistency. Even when that’s not the case, inconsistency is in practice likely to be rare and fleeting. I vaguely recall seeing some numbers for the frequency of read-repair operations in Dynamo, which should approximately reflect the level of inconsistency that actually occurs, but I can’t find them right now. What matters is that eventual consistency is often strong in practice, even if it’s not guaranteed, and that’s sufficient for many – I think most – applications. There are important categories of applications for which guaranteed strong consistency really is a requirement (anything to do with money or nuclear materials come to mind) and people writing such applications should consider whether the software they’re using really meets those requirements, but they shouldn’t be driving technology for absolutely everyone.