Several years ago, Amazon created something called S3 - Simple Storage Service. The "simple" part was based on the premise that distributed file systems are too complex, inhibiting scalability while providing too little marginal value to users. According to that theory, a system with a simpler API and semantics (e.g ...read more
There's a lot of hype around distributed file systems and their relatives, such as object stores. Every week, it seems, there's a new project claiming to be the fastest, most scalable, most robust, most space-efficient distributed file system ever, sweeping all precursors before it. Nine times out ...read more
Usually, when anyone in government tries to do anything about issues of equality or fairness, the techie-libertarian reaction is to complain about "legislating equal outcomes" and invoke the spectre of Harrison Bergeron as proof. (Hint: it's fiction!) For some reason, "neutrality" doesn't get the same reaction even though ...read more
Of all the projects I've proposed or worked on for GlusterFS, New Style Replication (NSR) is one of the most ambitious. It has two major goals:
Improved handling of network partitions
Improved performance, both normally and during repair
Personally, I consider the improved partition handling to be the more ...read more
Many people think of erasure coding as equivalent to replication, but with better storage utilization. Want to store 100TB of data with two-failure survivability? With replication you'll need 300TB of physical storage; with 8+2 erasure coding you'll need only 125TB. Yeah, sure, there's a performance downside ...read more
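The storage figures in that excerpt can be checked with a few lines of arithmetic. This is only a sketch of the capacity math, assuming N-way replication needs one extra full copy per tolerated failure and a k+m erasure code stores k data fragments plus m parity fragments; the function names are made up for illustration and are not taken from GlusterFS or any other project.

```python
def raw_tb_replication(usable_tb, failures_tolerated):
    """Raw capacity for full copies: surviving f failures takes f + 1 copies."""
    return usable_tb * (failures_tolerated + 1)


def raw_tb_erasure(usable_tb, data_fragments, parity_fragments):
    """Raw capacity for a k+m erasure code: overhead factor is (k + m) / k."""
    return usable_tb * (data_fragments + parity_fragments) / data_fragments


# 100TB usable, two-failure survivability
print(raw_tb_replication(100, 2))  # 3 copies -> 300
print(raw_tb_erasure(100, 8, 2))   # 8+2 code -> 125.0
```

Capacity is the only thing this compares; it says nothing about the performance side of the trade-off.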
Just some random thoughts from an email I sent recently, plus a bonus SCSI war story.
As the PVFS folks said long before I came along, some POSIX requirements are inappropriate for a distributed file system. I agree with that, but not with the object-store folks who claim that the ...
A recent discussion on the GlusterFS development mailing list got a bit hung up on the issue of what is or is not a "DSO" (Dynamically Shared Object). This is one of many issues with dynamic linking and dynamic loading that I've seen cause problems before, in large ...read more
Because of what I do for $dayjob, I hear a lot about "scale out" vs. "scale up" in various contexts. Also because of what I do for $dayjob, I get to read a lot of code. Some of it's new and clean. Some of it's . . . not. That's ...read more
(...and now for something completely different.)
Back in July, I started running. That would not be a particularly notable statement for many people, but most people haven't detested running all their lives and avoided it for thirty years. Instead, I've used stairclimbers and ellipticals for many years, but ...read more
Forgive me, Lord, for I have sinned.
I have written distributed systems in languages prone to race conditions and memory leaks.
I have failed to use model checking when I should have.
I have failed to use static analysis when I should have.
I have failed to write tests that ...
Without further ado...
Never heard of it.
Yeah, I hear all the hipsters yammering about it.
I checked out the docs and examples once.
I used it for a side project.
We're using it for some new projects at work.
We're using it in production.
We're using ...
I've written and talked many times about storage benchmarking. Mostly, I've focused on how to run tests and analyze results. This time, I'd like to focus on the parts that come before that - how you set up the system so that you have at least some chance ...read more
Every month or two, someone comes along and claims to be the new Best Thing Ever in distributed file storage. More often than not, it's just another programmer who recently discovered things like consistent hashing and replication, then slapped together another HTTP object store because that's what people ...read more
I know a lot of people are going to be asking me about Red Hat's acquisition of Inktank, so I've decided to collect some thoughts on the subject. The very very simple version is that I'm delighted. Occasional sniping back and forth notwithstanding, I've always been ...read more
This afternoon, I'll be giving a talk about (among other things) my current project at work - New Style Replication. For those who don't happen to be at Red Hat Summit, here's some information about why, what, how, and so on.
First, why. I'm all out of ...read more
The other day, I was talking to a colleague about the debate within OpenStack about whether to chase Amazon's AWS (what another colleague called the "failed Eucalyptus strategy") or forge its own path. It reminded me of an idea that was given to me years ago. I can't ...read more
A lot of people have asked when GlusterFS is going to have support for tiering or Hierarchical Storage Management, particularly to stage data between SSDs and spinning disks. This is a pretty hot topic for these kinds of systems, and many - e.g. Ceph, HDFS, Swift - have announced upcoming support ...read more
Way back when I was a young pup, either in college or after that but before I started my career, I got to use an operating system called MTS. That stands for Michigan Terminal System. It was created to run on IBM (and later Amdahl) mainframes, when U of M ...read more
This is a story about the dark side of moving your stuff into the cloud. It does have a (reasonably) happy ending, but along the way there are some important lessons to be learned about the relationship between cloud users and cloud providers, and how it's possible for people ...read more
For a while now, Kyle Kingsbury has been doing some excellent work evaluating the consistency and other properties of various distributed databases. His latest target is Redis. Mostly I agree with the points he makes, and that Redis Cluster is subject to inexcusable data loss, but there is one point ...read more
I was around when shared libraries were still a new thing in the UNIX world. At the time, they seemed like a great idea. On multi-user systems like those I worked on at Encore, static linking meant not only having a separate copy of the same code in every program ...read more
When I wrote about how local filesystems suck a while ago, it sparked a bit of debate. Mostly it was just local-filesystem developers being defensive, but Dave Chinner did make the quite reasonable suggestion that I could help by proposing a better alternative to the fsync problem. I've owed ...read more
Ever since one of the talks at LISA, I've been thinking about secure email. My thoughts are nowhere near complete, but I need to get them out of my head and I do that by writing about them. Apologies in advance.
I've actually been thinking for many years ...read more
(This started as a Hacker News discussion about an article on Advogato. The article's title/premise is "Why You Need STONITH" where "STONITH" means "Shoot The Other Node In The Head" and is an important concept in old-school HA. I might even have been present when the acronym was coined ...read more
In April of '89 I left my family and friends to move from Michigan to Massachusetts for a programming job. The new job paid twice as much as my first programming job had, which means three times as much as I was making since that company laid me off, so ...read more
Model checking is one of the most effective tools available for reducing the prevalence of bugs in highly concurrent code. Nonetheless, a surprising number of even very smart and very senior software developers and architects seem not to know about it. Of the many such people I've worked with over ...read more
I've often said that open-source distributed storage solutions such as GlusterFS and Ceph are on the same side in a war against more centralized proprietary solutions, and that we have to finish that war before we start fighting over the spoils. Most recently I said that on Hacker News ...read more
Distributed filesystems represent an important use case for local filesystems. Local-filesystem developers can't seem to deal with that. That, in a nutshell, is one of the most annoying things about working on distributed filesystems. Sure, there are lots of fundamental algorithmic problems. Sure, networking stuff can be difficult too ...read more
To a first approximation, "software engineering" refers to all of the things you need to know when you take "programming" and try to scale it up - more code, more people, more time. You don't need a civil engineer to dig a latrine, but you'd better have one ...read more
It's time to let some cats out of some bags. As my loyal readers (yeah right) have surely noticed, things have been quiet around here. Part of that has been the result of vacations and such, but also there's a lot of stuff I just haven't felt ...read more
And now for something completely different...
As part of my job - educating and evangelizing and whatever else you call it - I travel a fair amount. I know there are other people who travel ten times as much as I do, but then there are many more who travel less than ...read more
Sometimes people ask me why I always use small synchronous writes for my performance comparisons. Surely (they say), there are other kinds of operations that are more common or more important. Yes there are (I say), and don't call me Shirley. But seriously, folks, there are definitely other kinds ...read more
One of the problems with measuring and comparing performance of scalable systems is that any workload capable of producing meaningful results is going to be highly multi-threaded, and most developers don't know much about how to collect or interpret the results. After all, they hardly ever get any training ...read more
We're moving to an "agile" development process at work. Yes, we're becoming scrumbags. ;) One of the terms that really bothers me is "sprint" because I think of a sprint as a flat-out effort. That means minimal eating, sleeping, or time with family. Even hard-core hackers rarely do that ...read more
There are many things that differentiate a true software engineer from a mere programmer. Most of them are unpleasant - planning releases, reviewing designs or code, testing, release engineering, and so on. One of the most odious tasks is packaging software. I'll admit that it's an area where my ...read more
You might have noticed that things look a bit different around here. OK, if you're reading this in an RSS reader then maybe not, but otherwise it's kind of obvious. I've switched platforms yet again, because I was feeling a bit blocked. Publishing new stuff using my ...read more