Not long after I wrote about some of the less savory tactics often used in positioning technical products and projects, another ugly example of FUD has raised its head in the NoSQL world. No, I’m not referring to the “Facebook is abandoning Cassandra” silliness, which I think Eric Evans dealt with quite adequately. I’m talking about the “MongoDB will throw away your data” silliness, where somebody involved in a competing project expresses “sincere concern” about data durability.

Mikeal actually raises some excellent points about data durability. I have worked for many years in environments where such things are taken very seriously, and he’s correct on many of the technical points, such as the inadequacy of mmap-based approaches for ensuring recoverability. I do like MongoDB’s feature set, which has led me to use it for one project, but I admit that my faith in their ability and intention to fix some of the durability issues is just that: faith. It’s faith based on knowing something about the people involved, and on knowing (contra Mikeal) that the problems do not run so deep as to be unsolvable without a major overhaul, but it’s faith rather than current reality. Mikeal’s points would be better made without some of the exaggerations and misrepresentations that others have taken pains to correct in the comments, and especially without remarks like this:

Using an append-only file is the preferred, sane and most assured way to handle data loss or corruption.
[after being corrected and reminded of the soft-update alternative]
while soft updates are safe their inventor and implementor in UFS have admitted are very hard to get right.

“Hard to get right” and “insane” are not the same. Some things are intrinsically hard to get right but are still worth doing. That includes all of the hard work kernel people do on placement/alignment and ordering/flushing to make sure those append-only files work so well even for systems-programming tyros. It also includes some of the compaction/vacuuming problems that append-only files introduce (closely related to “segment cleaning” in log-structured filesystems). Two of the best-respected filesystems out there – ZFS and btrfs – are primarily based on copy-on-write (COW) rather than append-only logs/journals, though ZFS does have an intent log as well. VoldFS also uses COW, and I’ll gladly debate Mikeal on the merits of my “insane” choice for that environment or use case.
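To make the trade-off concrete, here is a minimal sketch of an append-only record log in Python. It is not CouchDB’s or MongoDB’s actual on-disk format; the framing (length plus CRC32) and the names `append_record`, `read_records` and `compact` are assumptions made purely for illustration. It shows both halves of the argument: the append path is simple because the kernel does the hard ordering/flushing work behind `fsync`, and the compaction step is the extra problem that append-only designs take on.

```python
import os
import struct
import zlib

# Hypothetical framing: 4-byte payload length + 4-byte CRC32, little-endian.
HEADER = struct.Struct("<II")


def append_record(path, payload):
    """Append one framed record and ask the OS to push it to stable storage."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)
        f.flush()             # drain Python's userspace buffer
        os.fsync(f.fileno())  # ask the kernel to flush the file to the device


def read_records(path):
    """Return every intact record, stopping at the first torn or corrupt one."""
    with open(path, "rb") as f:
        data = f.read()
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # incomplete tail from a crash, or corruption: ignore the rest
        records.append(payload)
        offset = start + length
    return records


def compact(path, keep):
    """Rewrite the log, keeping only records for which keep(payload) is true."""
    tmp = path + ".compact"
    with open(tmp, "wb") as f:
        for payload in read_records(path):
            if keep(payload):
                f.write(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)
        f.flush()
        os.fsync(f.fileno())
    # Swap the new file into place. A production version would also fsync the
    # containing directory; that step is omitted from this sketch.
    os.replace(tmp, path)
```

Recovery here is just “scan until the first bad record,” which is the appeal of the approach. The cost shows up in `compact`, which has to rewrite live data to reclaim space, and a real system would have to do that concurrently with new writes.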

The claim that the approach used by CouchDB is the only sane and assured one doesn’t help anyone. It’s merely partisan, not constructive. As Kristina Chodorow points out in the comments, it’s irritating when somebody just says “MongoDB ate my data” or “CouchDB is slow” without providing any specifics that can be addressed. (Since Mikeal seems to feel differently, I’m sure he won’t mind my mentioning that “CouchDB is a hog” is exactly the response I got when I suggested using it to replace MongoDB in that aforementioned project because I prefer its replication model.) As Dwight Merriman is also quoted in the comments as saying, the days of one-size-fits-all storage are over. MongoDB’s choices and roadmap might not suit everybody. Neither do CouchDB’s. There’s nothing wrong with proponents of one project engaging in constructive dialog regarding issues in others, but it does help if such criticism is in fact constructive and if people don’t consistently offer far more or sharper criticism than they are willing to accept themselves.