Other articles


  1. Updating POSIX

    "POSIX is obsolete." If you're a filesystem developer, you've probably heard that many times. I certainly have. It doesn't tell me anything I didn't already know about POSIX, but it does tell me two things about whoever says it.

    • They don't know what POSIX is.

    • They're lazy.

    To the first …

    read more
  2. Object Store File Systems

    Several years ago, Amazon created something called S3 - Simple Storage Service. The "simple" part was based on the premise that distributed file systems are too complex, inhibiting scalability while providing too little marginal value to users. According to that theory, a system with a simpler API and semantics (e.g …

    read more
  3. Distributed File System SBFAQ

    There's a lot of hype around distributed file systems and their relatives, such as object stores. Every week, it seems, there's a new project claiming to be the fastest, most scalable, most robust, most space-efficient distributed file system ever, sweeping all precursors before it. Nine times out of ten …

    read more
  4. Life on the Server Side

    Of all the projects I've proposed or worked on for GlusterFS, New Style Replication (NSR) is one of the most ambitious. It has two major goals:

    • Improved handling of network partitions

    • Improved performance, both normally and during repair

    Personally, I consider the improved partition handling to be the more important …

    read more
  5. How Erasure Coding is Not Like Replication

    Fri 13 February 2015

    tags: storage

    Many people think of erasure coding as equivalent to replication, but with better storage utilization. Want to store 100TB of data with two-failure survivability? With replication you'll need 300TB of physical storage; with 8+2 erasure coding you'll need only 125TB. Yeah, sure, there's a performance downside, but from a …

    read more
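The storage figures quoted in that excerpt can be checked with a little arithmetic. The following is only a minimal sketch: the function names are made up for illustration, and it assumes that two-failure survivability under replication means three full copies, and that "8+2" means 8 data fragments plus 2 parity fragments.

```python
def raw_capacity_replication(logical_tb, failures_tolerated):
    # Assumption: surviving N failures takes N + 1 full copies of the data.
    return logical_tb * (failures_tolerated + 1)


def raw_capacity_erasure(logical_tb, data_fragments, parity_fragments):
    # Assumption: each stripe holds data_fragments of real data plus
    # parity_fragments of redundancy, all fragments the same size.
    total_fragments = data_fragments + parity_fragments
    return logical_tb * total_fragments / data_fragments


if __name__ == "__main__":
    print(raw_capacity_replication(100, 2))   # 3 copies of 100TB
    print(raw_capacity_erasure(100, 8, 2))    # 10 fragments stored per 8 of data
```

Under those assumptions the two calls give 300TB and 125TB, matching the numbers in the excerpt. The sketch covers capacity only and says nothing about the performance or repair behavior the article goes on to discuss.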
  6. Notes on File System Semantics

    Fri 09 January 2015

    tags: storage

    Just some random thoughts from an email I sent recently, plus a bonus SCSI war story.

    As the PVFS folks said long before I came along, some POSIX requirements are inappropriate for a distributed file system. I agree with that, but not with the object-store folks who claim that the …

    read more
  7. Wannabe of the Month: Skylable

    Every month or two, someone comes along and claims to be the new Best Thing Ever in distributed file storage. More often than not, it's just another programmer who recently discovered things like consistent hashing and replication, then slapped together another HTTP object store because that's what people nowadays do …

    read more
  8. Inktank Acquisition

    I know a lot of people are going to be asking me about Red Hat's acquisition of Inktank, so I've decided to collect some thoughts on the subject. The very very simple version is that I'm delighted. Occasional sniping back and forth notwithstanding, I've always been a huge fan of …

    read more
  9. New Style Replication

    This afternoon, I'll be giving a talk about (among other things) my current project at work - New Style Replication. For those who don't happen to be at Red Hat Summit, here's some information about why, what, how, and so on.

    First, why. I'm all out of tact and diplomacy right …

    read more
  10. Data Gravity

    In the last few days, I had an interesting exchange on Twitter about the concept of data gravity. For convenience, I'll include the relevant parts here.

    • Mat Ellis: Interesting piece by @mjasay link … @randybias is right on the money, data gravity is already a big deal on the cloud

    • me …

    read more
  11. Tiers Without Tears

    A lot of people have asked when GlusterFS is going to have support for tiering or Hierarchical Storage Management, particularly to stage data between SSDs and spinning disks. This is a pretty hot topic for these kinds of systems, and many - e.g. Ceph, HDFS, Swift - have announced upcoming support …

    read more
  12. The World Is Not Flat

    Way back when I was a young pup, either in college or after that but before I started my career, I got to use an operating system called MTS. That stands for Michigan Terminal System. It was created to run on IBM (and later Amdahl) mainframes, when U of M …

    read more
  13. Fixing Fsync

    When I wrote about how local filesystems suck a while ago, it sparked a bit of debate. Mostly it was just local-filesystem developers being defensive, but Dave Chinner did make the quite reasonable suggestion that I could help by proposing a better alternative to the fsync problem. I've owed him …

    read more
  14. Comedic Open Storage

    Thu 24 October 2013

    tags: storage

    I've written before about some people's mania for object storage as an alternative to blocks and files. It's a valid model, but I do think its benefits are being pretty drastically oversold. Often there's a lot of FUD about distributed filesystems in particular, from people who clearly don't know the …

    read more
  15. SAN Stalwarts and Wistful Thinking

    I've often said that open-source distributed storage solutions such as GlusterFS and Ceph are on the same side in a war against more centralized proprietary solutions, and that we have to finish that war before we start fighting over the spoils. Most recently I said that on Hacker News, in …

    read more
  16. Local Filesystems Suck

    Distributed filesystems represent an important use case for local filesystems. Local-filesystem developers can't seem to deal with that. That, in a nutshell, is one of the most annoying things about working on distributed filesystems. Sure, there are lots of fundamental algorithmic problems. Sure, networking stuff can be difficult too. However …

    read more
  17. GlusterFS 3.5 Features

    It's time to let some cats out of some bags. As my loyal readers (yeah right) have surely noticed, things have been quiet around here. Part of that has been the result of vacations and such, but also there's a lot of stuff I just haven't felt ready to write …

    read more
  18. Small Synchronous Writes

    Sometimes people ask me why I always use small synchronous writes for my performance comparisons. Surely (they say), there are other kinds of operations that are more common or more important. Yes there are (I say), and don't call me Shirley. But seriously, folks, there are definitely other kinds of …

    read more
  19. Performance Measurement Pitfalls

    One of the problems with measuring and comparing performance of scalable systems is that any workload capable of producing meaningful results is going to be highly multi-threaded, and most developers don't know much about how to collect or interpret the results. After all, they hardly ever get any training in …

    read more
  20. Lies, Damn Lies, and Parallels

    This apparently happened a while ago, but it recently came to my attention via LWN that James Bottomley has made the claim that "Gluster sucks" (not a paraphrase, those seem to be his exact words). Well, I couldn't just let that go by, could I? Why would he say such …

    read more
  21. Metadata Servers

    I was sad that I had to miss RICON East, because I knew they had a lot of great speakers lined up. I really liked James Hughes's presentation, but must take issue with slide 15.

    Metadata Servers

    Required by traditional filesystems (POSIX) to translate names to sectors

    Hard to scale …

    read more