Comedic Open Storage

Thu 24 October 2013

tags: storage

I've written before about some people's mania for object storage as an alternative to blocks and files. It's a valid model, but I do think its benefits are being pretty drastically oversold. Often there's a lot of FUD about distributed filesystems in particular, from people who clearly don't know the details about what features they have or how they work. As a result, even though some people seem pretty excited about Seagate's new Kinetic Open Storage initiative, I approached it with a bit more skepticism. Here's the short version.

  • It's great that somebody's implementing object storage at this level.

  • This particular implementation is a joke.

I'm not just being nasty for no reason. There's a very real danger, with a technology like this, of early implementations over-promising and under-delivering so badly that by the time a good implementation comes along nobody can get over the bad taste in their mouth from the last version. That's what happened in distributed filesystems twenty years ago. Even though things have improved since then, there are still plenty of people who've never moved past "those things don't work" and don't even do the most basic research into the current state of the art before they go off and implement their own crappy incompatible almost-filesystem storage layers. I don't want object storage to be abandoned like that. I want it to succeed, but to do that it has to offer a better value proposition.

Before I get into the ways KOS falls short, I should say that I'm talking about details, and the documentation so far almost seems intended to obscure those details. The wiki is long on rhetoric, short on information. For example, I had to dig a bit to find the maximum size of a key (a potentially wasteful 4KB), and I still haven't found the maximum size of an object. So I cloned the preview repository and found a big steaming pile of javadoc. It's not even the good kind of javadoc; it's a lot more of the "bytearray: an array of bytes" boilerplate kind. So I might actually be wrong about some of the details. If so, I'll update appropriately.

My first objection has to do with NIH syndrome. After all, these ideas first reached prominence with Garth Gibson's NASD back in 1999, and later influenced the ANSI T10 object-storage standard. Back when it was still a PhD thesis, Ceph used a similar model called EBOFS (since abandoned in favor of btrfs), and there are others as well. Instead of building on - or even acknowledging - these predecessors, Seagate went off and developed Yet Another Object Storage API. Then, instead of documenting wire formats and noting differences vs. things people might already know, they just threw a Java library over the wall. Nice.

The second objection is security. There's a reason the S in NASD stands for Secure. If you want to gang a bunch of these devices together as the basis for a multi-user or multi-tenant distributed system, you'd better think hard about how to handle security. Apparently KOS didn't. There's some fluff about on-disk encryption, but nothing about key management, connection security, the actual semantics of their ACLs, etc. This information is not just "nice to have"; it's absolutely essential before developers can even begin to reason about the system they'll be coding for.

My third and most serious objection has to do with supporting only whole-object GET and PUT operations. That's fine for a key/value store or a deep archival store (the very opposite of "kinetic" BTW) but for anything else it's awful. If the objects can be very large, then updating any part of one involves a horrendous read/modify/write cycle. If they're kept small, then a higher level has to deal with the mapping from larger user-visible objects to smaller Kinetic objects. If there are multiple clients - and when are there not? - then there are some pretty serious coordination problems involved, and apparently not even a "conditional put" to help deal with the obvious race conditions. Instead of abstracting away the details and difficulties of modifying a single byte within an object (the original NASD vision), KOS requires the involvement of a robust coordination layer for even the simplest operations. Building cluster filesystems on top of shared block devices didn't work too well when the blocks were fixed size. Variable-sized blocks with 4KB keys don't change the equation much.
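To make the race concrete, here's a minimal sketch using a toy in-memory store. Everything in it is hypothetical: `WholeObjectStore`, `put_if_version` and `patch_byte` are names I made up for illustration, not anything from the Kinetic library. The first half shows two clients each changing one byte through whole-object GET and PUT, with one update silently lost. The second half shows how a version-checked "conditional put", if the device offered one, would at least let the loser notice and retry.

```python
class WholeObjectStore:
    """Hypothetical in-memory stand-in for a device that only
    supports whole-object GET and PUT. Not the Kinetic API."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        # Returns (version, value); a missing key reads as version 0, empty.
        return self._data.get(key, (0, b""))

    def put(self, key, value):
        # Unconditional overwrite: last writer wins.
        version, _ = self.get(key)
        self._data[key] = (version + 1, value)

    def put_if_version(self, key, value, expected_version):
        # Assumed "conditional put": succeeds only if nobody has
        # written the object since the caller read it.
        version, _ = self.get(key)
        if version != expected_version:
            return False
        self._data[key] = (version + 1, value)
        return True


def patch_byte(blob, offset, byte):
    """Modify one byte; the caller still has to rewrite the whole object."""
    buf = bytearray(blob)
    buf[offset] = byte
    return bytes(buf)


store = WholeObjectStore()

# Case 1: plain PUT. Both clients read, then both write back.
store.put("obj", b"aaaa")
_, seen_by_a = store.get("obj")
_, seen_by_b = store.get("obj")
store.put("obj", patch_byte(seen_by_a, 0, ord("X")))
store.put("obj", patch_byte(seen_by_b, 3, ord("Y")))
lost_update_result = store.get("obj")[1]  # client A's byte is gone

# Case 2: conditional PUT. The second writer is rejected and retries.
store.put("obj2", b"aaaa")
ver_a, data_a = store.get("obj2")
ver_b, data_b = store.get("obj2")
ok_a = store.put_if_version("obj2", patch_byte(data_a, 0, ord("X")), ver_a)
ok_b = store.put_if_version("obj2", patch_byte(data_b, 3, ord("Y")), ver_b)
if not ok_b:
    # Re-read the current contents and reapply the one-byte change.
    ver_b, data_b = store.get("obj2")
    ok_b_retry = store.put_if_version(
        "obj2", patch_byte(data_b, 3, ord("Y")), ver_b)
final = store.get("obj2")[1]
```

Even in the second case, every one-byte change still drags the whole object across the wire twice, and under contention the retries multiply that. A conditional put fixes correctness, not cost.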

As far as I can tell, this project does very little to help distributed-storage users and developers to meet their needs. Instead it creates false differentiation, disrupting for the sake of disruption or perhaps trying to justify higher margins in a cut-throat industry. It's like a double agent in the object-storage camp, potentially sabotaging others' efforts to have that vision accepted in the broader market.
