Ned Batchelder posted about another single- vs. multi-threading discussion, which has been a regular feature here as well. This time it cropped up in the context of using SQLite (an awesome embedded database as Ned says) from within a multi-threaded program. The first round is from SQLite’s author, D. Richard Hipp, and my response is on the next page:

Actually, this seems like a good opportunity to repeat my oft-ignored advice to not use more than one thread in a single address space. If you need multiple threads, create multiple processes. This has nothing to do with SQLite - it is just good programming advice. I have worked on countless multi-threaded programs over the years, and I have yet to see a single one that didn’t contain subtle, hard to reproduce, and very hard to troubleshoot bugs related to threading issues.

There are times when threading is appropriate and there are times when it’s not. As I put it in my server-design article:

One popular variant of this approach is to use only one thread, ever; while such an approach does avoid context thrashing, and avoids the need for locking as well, it is also incapable of achieving more than one processor’s worth of total throughput and thus remains beneath contempt unless the program will be non-CPU-bound (usually network-I/O-bound) anyway.

If you want to take advantage of having multiple processors (we’re now in a world where the latest version of the processor most people use on their desktop has two physical and four logical processors on a single die, with multi-die systems being common), you have to use multiple threads. Yes, it’s harder to do right. Yes, there might be more bugs that are harder to find, and not only when the people writing the code are clueless. That’s also true of writing network code, and in neither case is the difficulty an excuse for not doing it. By all means, if you’re writing a type of application that does not need to harness multiple processors and your programming model supports it, use one thread. Don’t assume everyone else lives in such a convenient and simple world, though.

The advice to use multiple processes instead is just horrible from a performance standpoint, because a process switch does everything a thread switch does and then some. In fact, it’s amazing to hear anyone associated with SQLite make the multiple-process suggestion, considering that much of the reason people might use SQLite instead of (for example) MySQL is precisely that it lives in the same process and thus does not require a full context switch per operation. One might think that indicates some awareness of the tradeoff between context-switch cost and separate-address-space safety, but apparently not.
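To make the shared-address-space point concrete, here is a minimal sketch in Rust (my choice of language for illustration, not something from the original discussion). The function name `parallel_sum` and its `workers` parameter are hypothetical. It splits a CPU-bound job across several threads that all read the same slice in place, with no copying and no inter-process messaging. It is one way such work might be divided, not a recommendation for any particular program.

```rust
use std::thread;

/// Sums `data` by splitting it into contiguous chunks, one per thread.
/// `parallel_sum` and `workers` are hypothetical names used for illustration.
/// Every thread borrows its part of the same slice directly: nothing is
/// copied or serialized, which is what sharing one address space allows.
fn parallel_sum(data: &[u64], workers: usize) -> u64 {
    // Treat a request for zero workers as a request for one.
    let workers = workers.max(1);
    // Ceiling division so every element lands in some chunk; the final
    // `max(1)` keeps the chunk size non-zero when `data` is empty.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);

    // Scoped threads may borrow `data` because the scope joins them all
    // before it returns.
    thread::scope(|scope| {
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().sum::<u64>()))
            .collect();

        // Combine the partial sums produced by each worker thread.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}
```

Calling `parallel_sum(&data, 4)` with `data` holding the integers 1 through 1000 returns 500500, the same answer a single-threaded loop gives. Whether it gets there any faster depends on the number of processors and on the input being large enough to outweigh the cost of starting the threads.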

I’d also like to address some specific comments in the other thread about single-threaded versions of programs touching fewer pages and running faster than their multithreaded counterparts. The first part is just bunk, confusing correlation with causation. I’ve seen many multithreaded programs that have good locality of reference, and single-threaded ones that are terrible. The second part is often true under light load, but that’s just not interesting in many cases. A web server, for example, typically exhibits a particular level of throughput for a single connection, then increasing throughput as connections increase until saturation is reached, then either flat or declining performance thereafter. Who cares about the single-connection performance? Just about nobody, unless it’s really awful. What really matters in that and many similar environments is the height of that peak and the number of connections necessary to reach it. It’s common for multi-threaded servers to start at a slightly lower point (nobody cares) but have a much higher peak at a much higher number of connections (people care a lot). If this is the class of program you work with, one that is CPU-bound rather than network- or I/O-bound, you simply can’t make excuses about the difficulty of multithreading. You have to deal with that difficulty instead if you want to be competitive with other products using similar platforms.