I ran some tests to measure performance implications of the multithreading approaches mentioned yesterday. The tests consisted of a server implemented using a chosen model, plus two single-threaded clients running on separate machines just blasting simple requests at the server as fast as they can and ignoring responses. Each request has a 97% chance of completing without any sort of delay; the remaining 3% experience a delay of exactly one second.
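To make the workload concrete, here is a minimal sketch of the per-request delay model described above. It is an illustration only, not the actual test harness; the names `request_delay` and `mean_delay` are made up for this example.

```python
import random

DELAY_PROBABILITY = 0.03  # fraction of requests that stall (from the test description)
DELAY_SECONDS = 1.0       # length of each stall (from the test description)


def request_delay(rng):
    """Return the simulated delay, in seconds, for one request."""
    return DELAY_SECONDS if rng.random() < DELAY_PROBABILITY else 0.0


def mean_delay(num_requests, seed=0):
    """Average delay per request over a seeded run of num_requests draws."""
    rng = random.Random(seed)
    total = sum(request_delay(rng) for _ in range(num_requests))
    return total / num_requests
```

The expected delay works out to 0.03 × 1 s = 30 ms per request. By that arithmetic, a server that handled requests strictly one at a time and blocked on every delay could not exceed roughly 33 requests/sec, so how each model copes with the delayed 3% matters a great deal.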

Here are some results:

  • Asymmetric multithreading: 7K requests/sec
  • Symmetric multithreading: 20K requests/sec
  • On-demand multithreading: 23K requests/sec
  • Event-based (“monothreading”): 24K requests/sec

Obviously, asymmetric multithreading sucks. It’s by far the worst model for this (common) kind of workload. It’s also notable that the big jump in performance comes when you switch from asymmetric to symmetric multithreading. I had expected the big jump to be between symmetric and on-demand multithreading, but here I have to confess that I made an error yesterday. In asymmetric multithreading there are at least two context switches per request, not one: from the listener to the worker and back to the listener. Symmetric multithreading cuts this in half by requiring only one worker-to-worker switch per request, and that’s why there’s a big jump. Apparently, there aren’t enough avoidable context switches left after that to make much of a difference.
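The structural difference can be sketched roughly as follows. This is my reading of the two models based on the paragraph above, not the benchmark code: an in-memory list stands in for the network, and `run_asymmetric`, `run_symmetric`, and `handle` are hypothetical names. It shows only where the handoff happens; it says nothing about real throughput.

```python
import queue
import threading

_DONE = object()  # sentinel meaning "no more requests"


def run_asymmetric(requests, num_workers, handle):
    """A dedicated listener hands each request to a worker through a queue."""
    handoff = queue.Queue()
    results, results_lock = [], threading.Lock()

    def listener():
        for req in requests:
            handoff.put(req)      # listener -> worker handoff for every request
        for _ in range(num_workers):
            handoff.put(_DONE)    # tell each worker to stop

    def worker():
        while True:
            req = handoff.get()
            if req is _DONE:
                return
            out = handle(req)
            with results_lock:
                results.append(out)

    threads = [threading.Thread(target=listener)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def run_symmetric(requests, num_workers, handle):
    """Every worker takes its turn listening, then handles what it received."""
    source = iter(requests)
    source_lock = threading.Lock()  # only one worker "listens" at a time
    results, results_lock = [], threading.Lock()

    def worker():
        while True:
            with source_lock:
                req = next(source, _DONE)
            if req is _DONE:
                return
            out = handle(req)     # same thread handles it; no listener handoff
            with results_lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In the first version every request crosses from the listener thread to a worker thread before any work is done; in the second, the thread that receives a request is the one that processes it, and the only handoff is the listening role passing to another worker.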

The most interesting comparison, perhaps, is between on-demand multithreading (ODM) and event-based programming (EBP). For starters, there’s just not that much difference. ODM benefits more from parallel operation, while EBP involves less coordination overhead, but in the end it’s a wash. Perhaps more importantly, ODM is more “transparent”; it doesn’t require explicit state maintenance like EBP does, handles certain common constructs (e.g. retry loops) more cleanly, and does not rely on an environment that supports true callback-based asynchronous I/O.
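To illustrate the point about retry loops and explicit state, here is a small sketch of the same retry logic written both ways. Everything in it is hypothetical: `FlakyService` is a stand-in for an operation that fails a few times, and the "event loop" is just a deque of callbacks rather than real asynchronous I/O.

```python
from collections import deque


class FlakyService:
    """Hypothetical operation that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.remaining_failures = failures

    def call(self):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            return None
        return "ok"


def fetch_threaded(service, max_attempts):
    """Thread style: retry state lives in ordinary local variables."""
    for attempt in range(1, max_attempts + 1):
        result = service.call()
        if result is not None:
            return result, attempt
    return None, max_attempts


class FetchTask:
    """Event style: the same logic, with the retry state kept in an object."""

    def __init__(self, service, max_attempts, loop, on_done):
        self.service = service
        self.max_attempts = max_attempts
        self.loop = loop
        self.on_done = on_done
        self.attempt = 0  # must be tracked explicitly between callbacks

    def start(self):
        self.attempt += 1
        result = self.service.call()
        # Simulate completion being delivered later as an event.
        self.loop.append(lambda: self.on_result(result))

    def on_result(self, result):
        if result is not None or self.attempt >= self.max_attempts:
            self.on_done(result, self.attempt)
        else:
            self.start()


def run_loop(loop):
    """Toy event loop: run queued callbacks until none remain."""
    while loop:
        loop.popleft()()
```

The threaded version is a plain `for` loop. The event-based version has to split the loop body across `start` and `on_result` and carry the attempt counter in an object, which is the kind of explicit state maintenance I mean.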

It seems to me that EBP only makes sense when it’s fully supported by the environment, when the reduction in coordination overhead outweighs the reduced parallelism, and when there’s no less painful way to squeeze a few percent out of your application. I know a lot of EBP advocates are likely to read this and disagree with me, but I have to say that I see no performance argument favoring EBP over reasonably implemented multithreading. If anybody would like to make such an argument, it won’t be very convincing unless it’s based on real numbers for realistic workloads.