Dan Kegel (dank@alumni.caltech.edu) wrote:

I ran a benchmark to see how long a call to poll() takes as you increase the number of idle fd's it has to wade through. I used socketpair() to generate the fd's.

Under Solaris 7, when the number of idle sockets was increased from 100 to 10000, the time to check for active sockets with poll() increased by a factor of only 6.5. That's a sublinear increase in time, pretty spiffy.

Linus sarcastically replied:

Yeah. It's pretty spiffy.

Basically, poll() is _fundamentally_ a O(n) interface. There is no way to avoid it - you have an array, and there simply is _no_ known algorithm to scan an array in faster than O(n) time. Sorry.

BZZZT! Wrong, Linus, but thank you for playing. Here's the problem with your reasoning:

It's entirely possible, and has been done in several systems with poll/select equivalents, to keep file descriptors on a list with ready ones at the beginning and unready ones at the end, or to have separate ready/unready lists. As an intermediate hack, an OS could keep a "first-ready-fd" value with a reserved value indicating "nothing happening on any fd". Checking whether there's any activity under any of these systems, then, only requires looking at the one item on the head of a list, and isn't O(n) at all. If you knew anything about other OSes or had any interest in learning, you'd know this. I haven't looked at the Solaris code, but it's extremely reasonable to believe that they've implemented one of these approaches and that's why poll() scales better on Solaris than on Linux.

Learn something, once in a while, OK? Innovation neither started nor ended with Linux. We don't have to reinvent every wheel (see my Linux 2.3/2.4 VFS Layer essay) because we're willfully ignorant of wheels in general.