One of the most useful metaphors in software engineering is Ward Cunningham's technical debt. Definitions and interpretations vary, but technical debt is basically all the stuff you're going to fix later because you were in too much of a hurry to do it right the first time. We ...read more
A recent discussion on the GlusterFS development mailing list got a bit hung up on the issue of what is or is not a "DSO" (Dynamic Shared Object). This is one of many issues with dynamic linking and dynamic loading that I've seen cause problems before, in large ...read more
Because of what I do for $dayjob, I hear a lot about "scale out" vs. "scale up" in various contexts. Also because of what I do for $dayjob, I get to read a lot of code. Some of it's new and clean. Some of it's . . . not. That's ...read more
(...and now for something completely different.)
Back in July, I started running. That would not be a particularly notable statement for many people, but most people haven't detested running all their lives and avoided it for thirty years. Instead, I've used stairclimbers and ellipticals for many years, but ...read more
Forgive me, Lord, for I have sinned.
I have written distributed systems in languages prone to race conditions and memory leaks.
I have failed to use model checking when I should have.
I have failed to use static analysis when I should have.
I have failed to write tests that ...
Without further ado...
Never heard of it.
Yeah, I hear all the hipsters yammering about it.
I checked out the docs and examples once.
I used it for a side project.
We're using it for some new projects at work.
We're using it in production.
We're using ...
I've written and talked many times about storage benchmarking. Mostly, I've focused on how to run tests and analyze results. This time, I'd like to focus on the parts that come before that - how you set up the system so that you have at least some chance ...read more
Every month or two, someone comes along and claims to be the new Best Thing Ever in distributed file storage. More often than not, it's just another programmer who recently discovered things like consistent hashing and replication, then slapped together another HTTP object store because that's what people ...read more
I know a lot of people are going to be asking me about Red Hat's acquisition of Inktank, so I've decided to collect some thoughts on the subject. The very very simple version is that I'm delighted. Occasional sniping back and forth notwithstanding, I've always been ...read more
This afternoon, I'll be giving a talk about (among other things) my current project at work - New Style Replication. For those who don't happen to be at Red Hat Summit, here's some information about why, what, how, and so on.
First, why. I'm all out of ...read more
The other day, I was talking to a colleague about the debate within OpenStack about whether to chase Amazon's AWS (what another colleague called the "failed Eucalyptus strategy") or forge its own path. It reminded me of an idea that was given to me years ago. I can't ...read more
A lot of people have asked when GlusterFS is going to have support for tiering or Hierarchical Storage Management, particularly to stage data between SSDs and spinning disks. This is a pretty hot topic for these kinds of systems, and many - e.g. Ceph, HDFS, Swift - have announced upcoming support ...read more
Way back when I was a young pup, either in college or after that but before I started my career, I got to use an operating system called MTS. That stands for Michigan Terminal System. It was created to run on IBM (and later Amdahl) mainframes, when U of M ...read more
This is a story about the dark side of moving your stuff into the cloud. It does have a (reasonably) happy ending, but along the way there are some important lessons to be learned about the relationship between cloud users and cloud providers, and how it's possible for people ...read more
For a while now, Kyle Kingsbury has been doing some excellent work evaluating the consistency and other properties of various distributed databases. His latest target is Redis. Mostly I agree with the points he makes, and that Redis Cluster is subject to inexcusable data loss, but there is one point ...read more
I was around when shared libraries were still a new thing in the UNIX world. At the time, they seemed like a great idea. On multi-user systems like those I worked on at Encore, static linking meant not only having a separate copy of the same code in every program ...read more
When I wrote about how local filesystems suck a while ago, it sparked a bit of debate. Mostly it was just local-filesystem developers being defensive, but Dave Chinner did make the quite reasonable suggestion that I could help by proposing a better alternative to the fsync problem. I've owed ...read more
Ever since one of the talks at LISA, I've been thinking about secure email. My thoughts are nowhere near complete, but I need to get them out of my head and I do that by writing about them. Apologies in advance.
I've actually been thinking for many years ...read more
(This started as a Hacker News discussion about an article on Advogato. The article's title/premise is "Why You Need STONITH" where "STONITH" means "Shoot The Other Node In The Head" and is an important concept in old-school HA. I might even have been present when the acronym was coined ...read more
In April of '89 I left my family and friends to move from Michigan to Massachusetts for a programming job. The new job paid twice as much as my first programming job had, which means three times as much as I was making since that company laid me off, so ...read more
Model checking is one of the most effective tools available for reducing the prevalence of bugs in highly concurrent code. Nonetheless, a surprising number of even very smart and very senior software developers and architects don't seem to know about it. Of the many such people I've worked with over ...read more
I've often said that open-source distributed storage solutions such as GlusterFS and Ceph are on the same side in a war against more centralized proprietary solutions, and that we have to finish that war before we start fighting over the spoils. Most recently I said that on Hacker News ...read more
Distributed filesystems represent an important use case for local filesystems. Local-filesystem developers can't seem to deal with that. That, in a nutshell, is one of the most annoying things about working on distributed filesystems. Sure, there are lots of fundamental algorithmic problems. Sure, networking stuff can be difficult too ...read more
To a first approximation, "software engineering" refers to all of the things you need to know when you take "programming" and try to scale it up - more code, more people, more time. You don't need a civil engineer to dig a latrine, but you'd better have one ...read more
It's time to let some cats out of some bags. As my loyal readers (yeah right) have surely noticed, things have been quiet around here. Part of that has been the result of vacations and such, but also there's a lot of stuff I just haven't felt ...read more
And now for something completely different...
As part of my job - educating and evangelizing and whatever else you call it - I travel a fair amount. I know there are other people who travel ten times as much as I do, but then there are many more who travel less than ...read more
Sometimes people ask me why I always use small synchronous writes for my performance comparisons. Surely (they say), there are other kinds of operations that are more common or more important. Yes there are (I say), and don't call me Shirley. But seriously, folks, there are definitely other kinds ...read more
One of the problems with measuring and comparing performance of scalable systems is that any workload capable of producing meaningful results is going to be highly multi-threaded, and most developers don't know much about how to collect or interpret the results. After all, they hardly ever get any training ...read more
We're moving to an "agile" development process at work. Yes, we're becoming scrumbags. ;) One of the terms that really bothers me is "sprint" because I think of a sprint as a flat-out effort. That means minimal eating, sleeping, or time with family. Even hard-core hackers rarely do that ...read more
There are many things that differentiate a true software engineer from a mere programmer. Most of them are unpleasant - planning releases, reviewing designs or code, testing, release engineering, and so on. One of the most odious tasks is packaging software. I'll admit that it's an area where my ...read more
You might have noticed that things look a bit different around here. OK, if you're reading this in an RSS reader then maybe not, but otherwise it's kind of obvious. I've switched platforms yet again, because I was feeling a bit blocked. Publishing new stuff using my ...read more