At the Linux Foundation’s recent End User Summit, I had the pleasure of meeting K.S. Bhaskar from FIS. He recently wrote an article on his blog about Eventual State Consistency vs. Eventual Path Consistency, in which he says some particularly interesting things about different kinds of consistency guarantees.
there are applications where detection of colliding updates does not suffice to ensure Consistency (where Consistency is defined as the database always being within the design parameters of the application logic – for example, the “in balance” guarantee that the sums of the assets and liabilities columns of a balance sheet are always equal).
He then gives an example showing an apparent problem with two financial transactions and their associated service charges, across two sites while a service-charge rate change is still “in flight” between them. I originally responded there, but my reply seems to have disappeared. Maybe it got lost due to a conflict with a subsequent update. ;) In any case, I might as well respond here because I think his example highlights an important issue. I don’t think Bhaskar’s example really demonstrates the problem he had described. In the last step he says that
B detects a collision, since it observes that the data that was read by the application logic on A to compute the result of transaction P has changed on B
How could B observe such a thing? Only if it knew either the data that was read on A (i.e. the service-charge rate in effect for the transaction was included as part of the replication request) or the exact replication state on A at the time P was processed there (e.g. by using vector clocks or similar). Either way, it would have enough information to replicate the transaction in a consistent fashion.
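To make the vector-clock variant concrete, here is a minimal sketch in Python. The names and structure are my own invention, and a real implementation would carry more metadata; the point is only that if the replication request for P carries the clock A observed when it read the rate, B can compare it against the clock on its own copy of the rate and tell exactly which case it is in.

# Minimal sketch of the vector-clock variant, assuming the replication
# request for P carries the clock A observed when it read sc_rate.
def dominates(a, b):
    """True if vector clock a reflects every event in b."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def classify(txn_read_clock, local_rate_clock):
    """How B might treat a transaction replicated from A."""
    if dominates(txn_read_clock, local_rate_clock):
        return "apply"                # A read a rate at least as new as B's
    if dominates(local_rate_clock, txn_read_clock):
        return "apply-with-old-rate"  # A used an older rate, but we know which one
    return "conflict"                 # concurrent updates: a genuine collision

# Example: B's copy of the rate was updated (at node "b") after A read it.
print(classify({"a": 3, "b": 1}, {"a": 3, "b": 2}))  # apply-with-old-rate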
The real problem would be if B didn’t know whether or not the rate change had reached A yet when P was processed there. That would result in B needing to distinguish between two possible states that would have to be handled differently, but with no way to make that distinction. The general rule to avoid these kinds of unresolvable conflicts is: don’t pass around references to values that might be inconsistent across systems. It’s like passing a pointer from one address space to a process in another; you just shouldn’t expect it to work. Either pass around the actual values or do calculations involving those values and replicate the result. For example, consider the following replication requests.
# $var indicates immediate substitution from the original context
# %var indicates a transaction-local variable

# Wrong: sc_rate is passed by reference and interpreted at destination
replicate transaction "transfer #zzz" {
    acct_x -= $amt * (1.0 + sc_rate);
    acct_y += $amt;
}

# Right: sc_rate is interpreted at source and passed by value
replicate transaction "transfer #zzz" {
    %sc_rate = $sc_rate;
    acct_x -= $amt * (1.0 + %sc_rate);
    acct_y += $amt;
}

# Right: service charge is calculated at source
# works, but not good for auditing
amt_with_sc = amt * (1.0 + sc_rate);
replicate transaction "transfer #zzz" {
    acct_x -= $amt_with_sc;
    acct_y += $amt;
}

# Right: service charge as a separate transaction
sc = amt * sc_rate;
replicate transaction "transfer #zzz" {
    acct_x -= $amt;
    acct_y += $amt;
}
replicate transaction "service charge for #zzz" {
    acct_x -= $sc;
}
In an ideal world, the interface and behavior for the replication subsystem would disallow or strongly discourage the wrong form. For example, it could require that any values meant to be interpreted or modified at the destination must be explicitly listed or tagged, and reject anything that abuses “extraneous” variables as in the first form above. (Auto-conversion of the first form into the second is likely to introduce its own kinds of unexpected behavior.) That would force people to use one of the methods that actually works.
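As an illustration only, a replication interface along those lines might look like the following Python sketch. The function names and the validation rule are invented for this post, not taken from any real system; the idea is just that anything not substituted at the source and not explicitly declared for destination interpretation gets rejected.

import re

class ReplicationError(Exception):
    pass

def replicate(name, body, values=None, dest_vars=()):
    """Queue a replication request (hypothetical interface).

    values:    $-substitutions resolved at the source.
    dest_vars: names the destination is explicitly allowed to interpret.
    """
    values = values or {}
    for var, val in values.items():     # simplistic source-side substitution
        body = body.replace("$" + var, repr(val))
    # Any identifier left over that isn't declared is an "extraneous"
    # variable like sc_rate in the first form above: reject the request.
    for ident in re.findall(r"\b[a-z_][a-z0-9_]*\b", body):
        if ident not in dest_vars:
            raise ReplicationError(
                f"{name}: undeclared destination variable {ident!r}")
    return body                         # a real system would enqueue this

# The second ("right") form passes...
replicate("transfer #zzz",
          "acct_x -= $amt * (1.0 + $sc_rate); acct_y += $amt;",
          values={"amt": 100.0, "sc_rate": 0.015},
          dest_vars=("acct_x", "acct_y"))

# ...while the first ("wrong") form raises ReplicationError, because
# sc_rate would have to be interpreted at the destination:
#   replicate("transfer #zzz",
#             "acct_x -= $amt * (1.0 + sc_rate); acct_y += $amt;",
#             values={"amt": 100.0},
#             dest_vars=("acct_x", "acct_y"))

The specific mechanism matters less than the effect: the safe forms become the path of least resistance.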
For what it’s worth, I’ve seen both the 3rd and 4th approaches in my bank statement, which is a paper way of replicating my account status from the bank’s database to me. If circumstances force me to withdraw cash from a “foreign” ATM, I’ll typically see the following in my statement (assume I withdrew $100):
transaction N: debit $101.50, withdrawal from bank X’s atm 123
transaction N+1: debit $2.00, foreign atm fee
That is, transaction N uses method #3: the other bank doesn’t tell my bank that it’s a $100 withdrawal with a $1.50 fee, it just tells my bank the total amount removed from my account. In transaction N+1, my bank tells me they’re charging $2.00. Interestingly, they don’t explicitly tell me that the fee in transaction N+1 was caused by transaction N, though as far as I’ve seen the two are always immediately sequential (implicit inter-transaction semantics, yuck!).
That’s a good example, Jeremy. One of the most common arguments I hear in favor of synchronous transactions and strong consistency is that you can’t afford to handle financial data any other way. Several months ago I saw that Eric Brewer himself, one of the most cited advocates of eventual consistency as a building block for scalable systems, had gotten so tired of this argument that he devoted considerable time in a keynote at a high-profile conference to describing how banks actually work and have worked for decades. Guess what? It’s not using synchronous distributed transactions. It’s using local operations and eventual consistency. Conflicts are not entirely eliminated, but they are rare enough, and the audit trails good enough, that resolving them through the courts is less costly than failing transactions over a mere network glitch would be. If eventual consistency is good enough for that use case, I say it’s good enough for a great many others as well. In my experience a common pattern is a very small, strongly consistent core that handles the operating parameters of the system (which rarely change), combined with a more weakly consistent system for the actual data.
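Here’s a rough sketch of that pattern, with hypothetical stand-in classes rather than any particular product: the rarely-changing operating parameters live in a tiny strongly consistent store, and the values read from it are baked into the eventually consistent transaction records, so no cross-system reference ever travels with them.

class StrongStore:
    """Tiny linearizable store (think consensus-backed) for
    rarely-changing operating parameters like sc_rate."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data[key]      # linearizable read
    def set(self, key, value):
        self._data[key] = value     # would go through consensus

class EventualStore:
    """High-volume, eventually consistent store for the actual data."""
    def __init__(self):
        self.log = []
    def append(self, record):
        self.log.append(record)     # would be replicated asynchronously

params = StrongStore()
txns = EventualStore()
params.set("sc_rate", 0.015)

def transfer(amt, src, dst):
    rate = params.get("sc_rate")    # read from the consistent core...
    txns.append({"id": "transfer #zzz",              # ...and bake the
                 "debit": (src, amt * (1.0 + rate)), # resulting values into
                 "credit": (dst, amt)})              # the replicated record

transfer(100.0, "acct_x", "acct_y")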
And yes, the implicit inter-transaction semantics are yucky. That’s why I added the string-valued transaction identifiers in my example above, even though the notation is all made up. Even if such IDs are not used during normal operation, it’s nice to have them visible for audit/recovery/debugging purposes.
I see the error in my post. Instead of “Instance B detects a collision” I should have said “Instance B should detect a collision”.
Banks today, especially in the US, are remarkably primitive. A lot of core processing is done by batch systems working against non-database flat files. My “day job” is developing a transaction-processing NoSQL database engine whose largest use is in real-time core banking, and the problem I blogged about is one that we are working on for future solutions. I wanted to illustrate the difference between what real-time core banking needs (eventual path consistency) and what many off-the-shelf databases provide today (eventual state consistency).