Why "DSO" is an Awful Term

A recent discussion on the GlusterFS development mailing list got a bit hung up on the issue of what is or is not a "DSO" (Dynamic Shared Object). This is one of many issues with dynamic linking and dynamic loading that I've seen cause problems before, in large part because linking and loading are two different things that people often mix up. I'll try to explain how this fact leads to confusion, and suggest how to avoid that confusion.

For the sake of this discussion, let's separate the two kinds of things that "DSO" might refer to. We'll use "library" to mean something that is specified when linking an executable, and is therefore reflected in that executable's on-disk contents. By contrast, a "module" is not specified when linking and not reflected in the on-disk executable; one must use dlopen from within the program to get at it. Despite their differences, both of these are dynamically linked. In both cases, the executable lacks a complete symbol table for the shared object (in the module case it lacks any symbol table at all). The library or module's symbols will be resolved when it is loaded. In fact, this late resolution is essential to make any kind of shared object work, on any platform, so the "D" in "DSO" is kind of redundant.
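To make the "module" case concrete, here is a minimal sketch using Python's ctypes, which calls dlopen and dlsym under the hood on POSIX systems. The helper names load_module and lookup_cos are made up for illustration, and the choice of the C math library is just an assumption about what is installed. Note that the math library may already be mapped into the interpreter's process; the point is only that nothing in the program names the symbol until it is looked up, by string, at run time.

```python
import ctypes
import ctypes.util


def load_module(short_name):
    """Explicitly load a shared object, much as dlopen() would in C."""
    # find_library maps a short name such as "m" to a platform file name.
    path = ctypes.util.find_library(short_name)
    if path is None:
        raise OSError(f"no shared object found for {short_name!r}")
    return ctypes.CDLL(path)  # ctypes calls dlopen() on POSIX systems


def lookup_cos(module):
    """Resolve the symbol "cos" by name at run time, like dlsym()."""
    cos = getattr(module, "cos")
    # The loader knows nothing about C types, so declare the signature here.
    cos.argtypes = [ctypes.c_double]
    cos.restype = ctypes.c_double
    return cos


if __name__ == "__main__":
    libm = load_module("m")  # hypothetical choice: the C math library
    cos = lookup_cos(libm)
    print(cos(0.0))
```

A "library" in the sense used above needs none of this: the dependency is recorded in the executable at link time and the calls look like ordinary function calls.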

The difference between libraries and modules is that modules are also dynamically loaded whereas libraries are not. Libraries are implicitly loaded into a process's memory space before the process starts (i.e. before main is called). Modules are explicitly loaded only when dlopen is called. Either way, loading includes mapping library/module contents into a process's memory. In the dynamic-linking case it also includes resolving symbols, but it is actually possible to do dynamic loading without dynamic linking (see my Quora answer on this topic for more details) so this is not essential.
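The implicit side can be seen from inside a running process as well. The sketch below assumes a POSIX system, where ctypes.CDLL(None) corresponds to dlopen(NULL) and yields a handle covering the main program and the objects loaded along with it at startup. The helper name already_loaded_symbol is hypothetical. No file is named and nothing new is requested, yet getpid is found, because the C library was mapped before any of the program's own code ran.

```python
import ctypes
import os


def already_loaded_symbol(name):
    """Look up `name` among objects that are already part of the process."""
    # Passing None asks for the running program itself (dlopen(NULL) on POSIX);
    # no shared object file is named here.
    process = ctypes.CDLL(None)
    return getattr(process, name)


if __name__ == "__main__":
    getpid = already_loaded_symbol("getpid")
    # Compare the C library's answer with the interpreter's own.
    print(getpid() == os.getpid())
```

This is only an illustration of the timing difference, not a statement about how any particular platform's loader is implemented.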

Where did all of this go wrong? Apparently it's Apple's fault. In their infinite arrogance, and contrary to every other UNIX platform, they decided that the same shared object could not be used as both module and library. It had to be one or the other. While precluding dual use without reason is generally a bad decision technically, Apple then made it worse by using "DSO" to mean only modules and not libraries. Is the "D" what really distinguishes an Apple DSO from an Apple non-DSO? Nope. That didn't stop them, and it didn't stop the libtool folks either. They never saw a stupid idea they didn't like, so they mindlessly copied Apple's bad terminology (including the "module" flag). This has led to much confusion since, including that which inspired this post.

So, if "DSO" doesn't work, what would? Surprisingly, it's not the "D" but the "S" that must go. Everything I've said so far about dynamic linking and loading would apply even if the objects in question are not shared. What we're really talking about here is two kinds of dynamically linked objects. On every platform but Apple's, the loading issue doesn't matter so "DLO" would be sufficient to distinguish these from statically linked libraries. However, we've seen that Apple's choices and terminology do infect others. Where the loading distinction does matter, it's between implicit (or immediate) loading vs. explicit loading. That would lead us to the rather unwieldy IDLO and EDLO. Alternatively, we could embrace the "library" vs. "module" distinction, resulting in DLL and DLM. Yes, DLL. Microsoft pretty much got this one right, folks. It's a technically accurate term, which would also be common across the Windows and UNIX/Linux platforms, so how is that a bad thing?

Sigh. But we programmers aren't so rational, as a group. Apple's not going to change. Libtool won't either. They'll both continue to use "DSO" inaccurately and misleadingly. At least now maybe the term will raise a red flag, and people will know to ask for clarification. When someone says "DSO", ask them whether they mean all things that are dynamic and shared and objects, or just some arbitrary Apple-defined subset.
