Object Mania

Apparently, at RICON East today, Seagate’s James Hughes said something like this:

Any distributed filesystem like GlusterFS or Ceph that tries to preserve the POSIX API will go the way of the dodo bird.

I don’t actually know the exact quote. The above is from a tweet by Basho’s Seth Thomas, and is admittedly a paraphrase. It led to a brief exchange on Twitter, but it’s a common enough meme that I think a fuller discussion is warranted.

The problem here is not the implication that there are other APIs better than POSIX. I’m quite likely to agree with that, and a discussion about ideal APIs could be quite fruitful. Rather, the problem is the implication that supporting POSIX is inherently bad. Here’s a news flash: POSIX is not the only API that either GlusterFS or Ceph supports. Both also support object APIs at least as well as Riak (also a latecomer to that space) does. Here’s another news flash: the world is full of data associated with POSIX applications. Those applications can run just fine on top of a POSIX filesystem, but the cost of converting them and/or their data to use some other storage interface might be extremely high (especially if they’re proprietary). A storage system that can speak POSIX plus SomethingElse is inherently a lot more useful than a storage system that can speak SomethingElse alone, for any value of SomethingElse.

A storage system that only supported POSIX might be problematic, but neither system that James mentions is so limited and that’s what makes his statement misleading. The only way such a statement could be more than sour grapes from a vendor who can’t do POSIX would be if there’s something about supporting POSIX that inherently precludes supporting other interfaces as well, or incurs an unacceptable performance penalty when doing so. That’s not the case. Layering object semantics on top of files, as GlusterFS does, is pretty trivial and works well. Layering the other way, as Ceph does, is a little bit harder because of the need for a metadata-management layer, but also works. What really sucks is sticking a fundamentally different set of database semantics in the middle. I’ve done it a couple of times, and the impedance-mismatch issues are even worse than in the Ceph approach.
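To make the "objects on top of files" direction concrete, here is a minimal sketch in Python. It is not how GlusterFS implements its object interface; the class name `FileObjectStore`, its methods, and the `b_`/`o_`/`t_` file-name prefixes are all made up for illustration. It only shows that put/get/delete/list map almost directly onto ordinary POSIX file operations.

```python
import os
import tempfile
from urllib.parse import quote, unquote


class FileObjectStore:
    """Toy object store (hypothetical): buckets are directories, objects are files."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _bucket_dir(self, bucket):
        # Percent-encode the name so "/" and friends cannot escape the root.
        return os.path.join(self.root, "b_" + quote(bucket, safe=""))

    def _path(self, bucket, key):
        # The "o_" prefix keeps keys like "." or ".." from naming a directory.
        return os.path.join(self._bucket_dir(bucket), "o_" + quote(key, safe=""))

    def put(self, bucket, key, data):
        bucket_dir = self._bucket_dir(bucket)
        os.makedirs(bucket_dir, exist_ok=True)
        # Write to a temp file, then rename: readers never see a partial object.
        fd, tmp = tempfile.mkstemp(dir=bucket_dir, prefix="t_")
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, self._path(bucket, key))

    def get(self, bucket, key):
        with open(self._path(bucket, key), "rb") as f:
            return f.read()

    def delete(self, bucket, key):
        os.remove(self._path(bucket, key))

    def list(self, bucket, prefix=""):
        bucket_dir = self._bucket_dir(bucket)
        if not os.path.isdir(bucket_dir):
            return []
        # Only "o_" entries are objects; in-flight "t_" temp files are skipped.
        keys = (unquote(n[2:]) for n in os.listdir(bucket_dir) if n.startswith("o_"))
        return sorted(k for k in keys if k.startswith(prefix))
```

Everything stored this way is still an ordinary file, so an existing POSIX application can read the same bytes without knowing the object interface exists. A real implementation would also have to deal with things this sketch ignores, such as file-name length limits, object metadata, and concurrent deletes.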

As I’ve said over and over again in my presentations, there is no one “best” data/operation/consistency model for storage. Polyglot storage is the way to go, and POSIX is an important glot. I’ve probably used S3 for longer than anyone else reading this, and I was setting up Swift the very day it was open-sourced. I totally understand the model’s appeal. POSIX itself might eventually go the way of the dodo, but not for a very long time. Meanwhile, people and systems that try to wish it away instead of dealing with it are likely to go the way of the unicorn – always an ideal, never very useful for getting real work done.

