Every once in a while I search for my name on Google (yeah, like you need a link to that). It’s not so much to find others’ references to me, since most of the hits are usually to stuff I wrote myself, but occasionally such a reference does pop up. Yesterday, I found this one:

There’s a long thread on MacinTouch about OS X and fragmentation. Worrying about fragmentation is so Personal Computer. A few people in that thread set the whiners straight about just how traditional unix file systems handle file writes so as to minimize fragmentation although I’m not entirely sure that’s how HFS+ works. Jeff Darcy would be sure to know.

Well, Steve, as gratifying as I find your confidence in me, I have to admit that I don’t know much about how HFS+ works. Truth be told, most of my Mac work was done before HFS+ even existed. Damn, I’m old.

However, there is something in your comment that I feel deserves a reply. UNIX filesystems tend to do a lot to prevent fragmentation and generally reduce head motion – preallocation, cylinder groups, blah blah blah – but fragmentation does still occur, and most filesystems don’t actually do all that much to undo it once it exists. Consider this quote from one of IBM’s many excellent articles about filesystems:

ext2 filesystems take a long time to exhibit signs of fragmentation. However, I would argue that fragmentation is still a big problem, because although ext2 does not get fragmented easily, fragmentation is a one-way, cumulative process. That is, while ext2 fragments slowly, it cannot defragment itself. In other words, any often-modified ext2 filesystem will gradually get more and more fragmented, and thus slower. Even worse, there are no production-quality ext2 filesystem defragmenting programs currently available.
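To make the preallocation point above a little more concrete: the same idea is visible from user space. If an application tells the filesystem up front how big a file will be, the filesystem has a chance to reserve the blocks in one contiguous chunk instead of handing them out one append at a time. Here’s a minimal sketch using posix_fallocate(3) – purely my illustration of the general technique, not anything specific to ext2’s internals:

```c
#define _XOPEN_SOURCE 600   /* for posix_fallocate */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* Create a file and reserve 64 MB up front.  Because the filesystem
     * knows the final size before any data is written, it can try to
     * allocate the blocks contiguously rather than doling them out one
     * append at a time. */
    const off_t size = 64 * 1024 * 1024;
    int fd = open("bigfile.dat", O_CREAT | O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    int err = posix_fallocate(fd, 0, size);
    if (err != 0) {
        /* Note: posix_fallocate returns an errno value directly
         * rather than -1 plus errno. */
        fprintf(stderr, "posix_fallocate: %s\n", strerror(err));
        close(fd);
        return 1;
    }

    /* ... now write the data at leisure; the blocks are already reserved ... */
    close(fd);
    return 0;
}
```

That’s avoidance, though, not repair – which brings us back to the one-way-street problem.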

There are defragmentation tools for ext2, but a lot of “people who should know” don’t seem to trust them very much. You’ll often find a suggestion to dump/restore a filesystem periodically to reduce fragmentation instead. The same story applies in general to FFS, UFS, etc. I don’t know about JFS/XFS/ReiserFS/etc., but I will say this: after-the-fact defragmentation functionality integrated into the filesystem isn’t a whole lot different from the same functionality implemented as a separate program. In some ways it’s worse, since it significantly increases the complexity, size, and potential for error in a kernel component, where those are all much worse problems than in user space.
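If you’re curious how fragmented a given file actually is, you don’t need a defragmenter to find out. Here’s a rough sketch that asks the kernel for a file’s extent count via the FIEMAP ioctl; this is Linux-specific, and a newer interface than the old root-only FIBMAP block-at-a-time approach, so treat it as an illustration rather than something portable. One extent means the file is contiguous; more than one means it’s fragmented to some degree.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>       /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>   /* struct fiemap, FIEMAP_FLAG_SYNC */

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* With fm_extent_count set to zero the kernel copies no extent
     * records back; it just reports how many there are in
     * fm_mapped_extents, which is all we want here. */
    struct fiemap fm;
    memset(&fm, 0, sizeof(fm));
    fm.fm_start = 0;
    fm.fm_length = ~0ULL;            /* map the whole file */
    fm.fm_flags = FIEMAP_FLAG_SYNC;  /* flush dirty data first */
    fm.fm_extent_count = 0;

    if (ioctl(fd, FS_IOC_FIEMAP, &fm) < 0) {
        perror("FS_IOC_FIEMAP");
        close(fd);
        return 1;
    }

    printf("%s: %u extent(s)\n", argv[1], fm.fm_mapped_extents);
    close(fd);
    return 0;
}
```

This is essentially what filefrag(8) from e2fsprogs reports, minus the per-extent detail.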

In conclusion, UNIX filesystems are not immune to fragmentation, though they are more resistant to it than FAT (NTFS is really much more like a UNIX filesystem than anything else, in practically all respects). Furthermore, the tools for fixing fragmentation on UNIX are actually worse than their DOS/Windows counterparts. However, worrying about fragmentation is still “so Personal Computer”, because real computers are attached to real RAID storage systems with huge caches and internal optimization, so host-based filesystems’ notions of physical location and fragmentation are all wrong anyway. Physical placement has become a storage-side problem, not a host-side problem, and that will be even more the case when the next generation of “storage virtualization” cruft becomes available.