Yes, you read that right – Zettar, not Zetta. Robin Harris mentioned them, so I had a quick look. At first I was going to reply as a comment on Robin’s site, but the reply got long and this site has been idle for a while so I figured I’d post it here instead. I also think there are some serious issues with how Zettar are positioning themselves, and I feel a bit more free expressing my thoughts on those issues here. So, here are my reactions.

First, if I were at Zetta, I’d be livid about a company in a very similar space using such a similar name. That’s discourteous at best, even if you don’t think it’s unethical or illegal. It’s annoying when the older company has to spend time in every interaction dispelling confusion between the two, and that’s if they even get the chance before someone writes them off entirely because they looked at the wrong website. The fact that Robin felt it necessary to add a disambiguating note in his post indicates that it’s a real problem.

Second, Amazon etc. might not sell you their S3 software, but there are other implementations – Project Hail’s tabled, Eucalyptus’s Walrus, ParkPlace. Then there’s OpenStack Storage (Swift) which is not quite the same API but similar enough that anyone who can use one can pretty much use the other. People can and do run all of these in their private clouds just fine.

Third, there are several packages that provide a filesystem interface on top of an S3-like store – s3fs, s3fuse, JungleDisk, even my own VoldFS (despite the name). Are any of these production-ready? Perhaps not, but several are older than Zettar and several are open source. The difference in deployment risk for Zettar vs. alternatives is extremely small.

Lastly, Zettar’s benchmarks are a joke. The comparison between local Zettar and remote S3 is obviously useless, but even the comparison with S3 from within EC2 is deceptive. Let’s just look at some of the things they did wrong:

  • They compare domU-to-domU for themselves vs. EC2 instances which are likely to be physically separate.
  • They fail to disclose what physical hardware their own tests were run on. It’s all very well to say that you have a domU with an xxx GHz VCPU and yyy GB of memory, but physical hardware matters as well.
  • Disk performance would have mattered if their tests hadn’t used laughably small file sizes and total data sizes (smaller than memory), which means everything can be served from cache.
  • 32-bit? EC2 m1.small? Come on.
  • They describe one set of results, failing to account for the fact that EC2 performance is well known to vary over time. How many runs did they really do before they picked these results to disclose?
  • To measure efficiency, they ran a separate set of tests using KVM on a notebook. WTF? Of course their monitoring showed negligible load, because their test presents negligible load.
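
To make a couple of those complaints concrete, here is a minimal Python sketch of what more defensible measurement might look like: repeat the run several times, report the spread rather than one hand-picked number, and check that the data set is large enough that it can't be served from memory. Everything here is my own illustration, not anything from Zettar's or Amazon's tooling. The names `run_trials`, `summarize` and `defeats_page_cache` are hypothetical, `workload` stands in for whatever actually moves the bytes, and the 2x-memory threshold is just an assumed rule of thumb.

```python
import statistics
import time
from typing import Callable, Dict, List


def run_trials(workload: Callable[[], int], trials: int = 10) -> List[float]:
    """Run `workload` repeatedly; return throughput (bytes/second) per run.

    `workload` is a hypothetical callable that performs one full transfer
    and returns the number of bytes it moved.
    """
    throughputs = []
    for _ in range(trials):
        start = time.perf_counter()
        nbytes = workload()
        # Guard against a zero interval on very fast (i.e. useless) workloads.
        elapsed = max(time.perf_counter() - start, 1e-9)
        throughputs.append(nbytes / elapsed)
    return throughputs


def summarize(throughputs: List[float]) -> Dict[str, float]:
    """Summarize all runs, so the reader sees the variation and not one result."""
    mean = statistics.mean(throughputs)
    stdev = statistics.stdev(throughputs) if len(throughputs) > 1 else 0.0
    return {
        "runs": float(len(throughputs)),
        "min": min(throughputs),
        "median": statistics.median(throughputs),
        "max": max(throughputs),
        "mean": mean,
        "stdev": stdev,
        # Coefficient of variation: a large value means any single run is noise.
        "cv": stdev / mean if mean else 0.0,
    }


def defeats_page_cache(total_bytes: int, ram_bytes: int, factor: float = 2.0) -> bool:
    """True if the data set is at least `factor` times RAM (assumed rule of thumb)."""
    return total_bytes >= factor * ram_bytes
```

A report built this way would publish every field from `summarize` along with the number of runs, and would refuse to call a test "disk-bound" unless `defeats_page_cache` held for the machine under test.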

There’s nothing useful about these results at all. Even Phoronix can do better, and that’s the very lowest rung of the storage-benchmarking ladder.

This looks like a distinctly non-enterprise product measured in distinctly non-enterprise ways. They even refer to cloud storage 2.0, I guess to set themselves apart from other cloud-storage players. They’ve set themselves apart, all right. Beneath.

Disclaimer: I work directly with the Project Hail developers, less so with the OpenStack Storage folks. My own CloudFS (work) and VoldFS (personal) projects could also be considered to be in the same space. This is all my own personal opinion, nothing to do with Red Hat, etc.