When I talk about working on a cloud filesystem, I often describe its interface as POSIX or near-POSIX. Most people just accept that, a few look at me blankly, and a few take it as a cue to ask, “Is POSIX still relevant?” Unfortunately, when this happens I’m usually somewhere I have to suppress the eye-roll it induces. A little knowledge is dangerous, and people who ask that question are living proof. “POSIX” in this context basically just means a familiar filesystem, as implemented by any sane VFS layer (plus Linux’s), with approximately the following functionality.

  • Hierarchical directory structure (readdir, creat/mkdir, link/symlink)
  • Inode abstraction with attributes such as owner, permissions, times (stat/fstat, chown/chgrp/chmod)
  • Read and write of simple byte streams (open/close, read/write, fsync, ftruncate)
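
To make that list concrete, here is a minimal sketch that touches one or two calls from each group, using Python's os module as a thin wrapper over the underlying system calls. The function name posix_tour and the file names are made up for illustration, and it assumes a Linux-like local filesystem where hard links and symlinks are permitted.

```python
import os
import stat
import tempfile


def posix_tour(root):
    """Exercise the three groups of calls from the list inside `root`.

    `posix_tour` and the file names below are hypothetical, chosen only
    for this illustration.
    """
    subdir = os.path.join(root, "docs")
    path = os.path.join(subdir, "note.txt")
    hard = os.path.join(subdir, "note.hard")
    soft = os.path.join(subdir, "note.soft")

    # 1. Hierarchical namespace: mkdir, creat (via open), link, symlink, readdir.
    os.mkdir(subdir)
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    os.link(path, hard)            # second name for the same inode
    os.symlink("note.txt", soft)   # name that points at another name
    names = sorted(os.listdir(subdir))

    # 2. Simple byte streams: write, fsync, ftruncate, read.
    os.write(fd, b"hello, posix")
    os.fsync(fd)                   # ask for the data to reach stable storage
    os.ftruncate(fd, 5)            # cut the file back to b"hello"
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)

    # 3. Inode attributes: chmod, fstat.
    os.chmod(path, 0o600)
    info = os.fstat(fd)
    os.close(fd)

    return {
        "names": names,
        "data": data,
        "mode": stat.S_IMODE(info.st_mode),
        "nlink": info.st_nlink,
        "size": info.st_size,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(posix_tour(root))
```

Nothing here is exotic, which is rather the point: this is the interface nearly every program already assumes.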

Your email is almost certainly stored in a POSIX filesystem somewhere, as are all of your spreadsheets and photos and source code etc. Most of what web servers deliver comes from a POSIX filesystem either directly or indirectly (e.g. from an RDBMS which is loaded from a filesystem and stores data in a filesystem). For any one of these examples there might be other options as well, but not always (I’ve yet to see a compiler that operates on any other kind of storage) and in any case there’s only one that satisfies all of these needs – all day, every day, for billions of users. The people who question POSIX’s relevance generally don’t know how their supposed alternatives actually work, or that in the dim and distant past there really were alternatives to the POSIX model, and that those alternatives were found sorely lacking. (The same, BTW, is often true of other POSIX standards besides those involving filesystems.)

All that said, most filesystems are only 99% POSIX. Many of the divergences have to do with consistency/ordering/durability issues. NFS is (in)famous for cutting corners in many of these areas, as are ext[234], and most people neither know nor care until they try to run their software on a different system and it unexpectedly fails (e.g. because PVFS2 doesn’t allow access to unlinked files). For the most part, though, relatively trivial differences such as character sets and name lengths and forward-slash vs. back-slash are the only ones people notice.

So go ahead and use whatever other storage model you want. Just remember that sooner or later your data will probably pass through a POSIX filesystem (and from there to a T10/T11/T13 block device), and that filesystem has a lot to say about how your storage will perform or how well your data is protected. Too often I’ve seen people test some other kind of storage that runs on top of a filesystem, just assuming that the filesystem will do its job, then use those very same results to proclaim the irrelevance of POSIX. It’s like driving to a conference where you proclaim the irrelevance of cars.
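
For anyone curious what “access to unlinked files” looks like in practice, here is a minimal sketch, again using Python's os module. The helper name read_after_unlink and the file name are made up for illustration. On a filesystem that honors this bit of POSIX behavior, the data stays readable through the open descriptor after the name is removed; on one that diverges here, a later call may fail instead.

```python
import os
import tempfile


def read_after_unlink(directory):
    """Write a file, unlink its name, then read it back via the open fd.

    `read_after_unlink` and "scratch.dat" are hypothetical names used
    only for this illustration.
    """
    path = os.path.join(directory, "scratch.dat")
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.write(fd, b"still here")
        os.unlink(path)                    # the name is gone...
        name_gone = not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)            # ...but the open file should survive
    finally:
        os.close(fd)                       # storage is reclaimed after the last close
    return name_gone, data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(read_after_unlink(scratch))
```

Plenty of software leans on exactly this pattern for scratch files that clean themselves up, which is why it hurts when a filesystem quietly opts out.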