Stone Age Programming

As a systems programmer, I get to work with a lot of old-fashioned code and tools. The code base I work on every day is in C, complete with manual memory management and constant checking of return values instead of exceptions. Heck, the Gluster coding style even involves using "goto" for most error handling within a function. One could argue that this is all as it should be, and that features such as GC or exceptions don't belong in systems code. Within some systems-programming subdomains that's even true, but certainly not in all. Most often, doing everything the hard way is a vestige of code having been written in a more primitive era and not rewritten since. Nine times out of ten, writing new code in that same style would be somewhere between unwise and idiotic.

Fortunately, as one moves further from the kernel/embedded space, the code rapidly becomes more modern. A lot of newer infrastructure code, from distributed object stores and databases to configuration management and container provisioning, is written in more modern languages. For a long time that meant Java; more recently it's Go, with Python sort of running second all along. You can even add Clojure or Erlang to the list if you want. Clearly, these people understand the value of having more modern features in a programming language. That's why it amazes me to see some of those very same people clinging to an archaic style when it comes to dealing with asynchronous programming. I refer, of course, to the "callback hell" style best known from JavaScript and Node.js.

    downloadFile('', function(err, data) {
        console.log('Got weather data:', data);
    });

(from Stack Abuse)

What's happening here is that we're creating an anonymous function, then calling downloadFile and explicitly telling it to call that anonymous function when it's done. This is manual stack management, like you have to do in assembly language. It's what people did back before procedures and functions became commonplace in the 1960s. The equivalence becomes even more apparent when you consider a case where we need to pass some of our own data through downloadFile and expect to get it back in our anonymous function. If you're lucky, your language has lambdas that can capture that data for you. Failing that, you'd better hope that there's a version of downloadFile with an extra "user data" argument for that purpose, because if you don't even have that, things get really ugly.
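To make the "pass our own data through" case concrete, here is a minimal sketch of both options. Everything in it is made up for illustration: the downloadFile stub, its optional third "user data" parameter, and the reportWithClosure, reportWithUserData, and onDownloaded helpers are hypothetical. Only the (err, data) callback shape comes from the snippet above.

```javascript
// Hypothetical stand-in for downloadFile. It "completes" on a later tick and
// hands back canned data. The optional userData parameter is an assumption,
// modeled on C-style callback APIs that take an extra context pointer.
function downloadFile(url, callback, userData) {
    setImmediate(function() {
        callback(null, 'payload from ' + url, userData);
    });
}

// Option 1: a closure. The anonymous function captures `city` and `done`
// from the enclosing scope, so downloadFile never has to know about them.
function reportWithClosure(city, url, done) {
    downloadFile(url, function(err, data) {
        done(err, city + ': ' + data);
    });
}

// Option 2: no captured variables. Our context has to travel through
// downloadFile as explicit user data and come back as an extra argument.
function onDownloaded(err, data, ctx) {
    ctx.done(err, ctx.city + ': ' + data);
}

function reportWithUserData(city, url, done) {
    downloadFile(url, onDownloaded, { city: city, done: done });
}
```

Either way, the caller is the one saving and restoring its own context across the call, which is the bookkeeping a stack frame would otherwise do for you. The closure version just hides it better.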

People who advocate this approach to concurrency will try to make it sound all computer-science-y by talking about closures and continuations, but it's still fundamentally a stone-age technique: doing something manually that should be taken care of automatically, like manual memory management vs. automatic GC. They'll also make wild claims about how threads are so inefficient, but never back those claims up. Here's a suggestion: go write a program to see how many thread switches per second you can get on a modern processor. It's in the millions, which is more than enough for most situations. And that's for OS threads, which have to go through the kernel's scheduler. The numbers for user-level ("green") threads are even higher. What's happening is that the callback-hell advocates are conflating thread switches with process switches, which really are expensive because they have to do a lot more. A page fault is often more expensive than a thread switch, but ask one of these "threads are slow" types to explain how they're dealing with page-fault overhead and I guarantee you'll get nothing but a blank stare.

I'm not saying that threads are The Answer to concurrent/asynchronous programming. Far from it. They have their own problems, though a lot of those have more to do with the way people do locking than with threads themselves. I've written about the actor model before, and more recently I've been moving more toward the promise/future camp. There are several approaches, all subject to the usual kinds of tradeoffs and preferences. All I'm saying is, let's please stop portraying the callback-heavy style as an advance when it's really a huge step back. It's old and rusty, not new and shiny. In fact, if you look under the covers of how any of these callback-oriented systems are actually implemented, you're likely to find a core/engine/reactor that's implemented in a far different and fundamentally better paradigm. Use that directly, instead of layering an archaic style on top of a modern framework.
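For comparison, here is a rough sketch of what the promise/future style can look like in JavaScript. It is only an illustration under assumptions: the downloadFile stub and the downloadFileAsync and fetchBoth names are hypothetical, not part of any real library. The callback API gets wrapped once, and callers then write straight-line code.

```javascript
// Hypothetical callback-style API with the same (err, data) shape as the
// downloadFile call shown earlier. It reports an error for an empty URL.
function downloadFile(url, callback) {
    setImmediate(function() {
        if (!url) {
            callback(new Error('no URL given'));
        } else {
            callback(null, 'payload from ' + url);
        }
    });
}

// Wrap the callback API once so callers get a promise (a "future") back.
function downloadFileAsync(url) {
    return new Promise(function(resolve, reject) {
        downloadFile(url, function(err, data) {
            if (err) {
                reject(err);
            } else {
                resolve(data);
            }
        });
    });
}

// Sequential-looking code: each await suspends this function until the
// promise settles, and a rejection propagates like an ordinary exception.
async function fetchBoth(urlA, urlB) {
    const a = await downloadFileAsync(urlA);
    const b = await downloadFileAsync(urlB);
    return [a, b];
}
```

In fetchBoth the local variables survive across each await without any hand-written context passing, and a failed download rejects the returned promise instead of having to be checked in every callback.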
