The retreat I attended was, as usual, a combined retreat for two projects: OceanStore and ROC (Recovery-Oriented Computing). Also as usual, there was a second retreat going on at the same place and time, for SAHARA, with a few joint sessions etc. I’ve never quite gotten a handle on exactly what the SAHARA folks are trying to achieve, but it’s basically about a low-level network infrastructure that’s shared by multiple providers (especially of the cellphone variety) instead of being controlled by one.

One major bummer at the retreat was that neither David Patterson (he of RISC, RAID, and the highly-regarded computer architecture books) nor Dennis Geels (one of the OceanStore “core crew”) was able to attend, in both cases due to illness. In addition to being deprived of their fine company, this caused several minor rearrangements of the schedule around the talks that they had been scheduled to give. There was still lots of interesting stuff, though, most of which has been very well summarized here. What follows are some of my own personal, idiosyncratic thoughts, beyond what was captured in the official visitor feedback pages (thanks to George C and Aaron B for capturing those).

  • A Utility-Centered Approach to Designing Dependable Internet Services (George Candea)

    This work seems, not to belittle it in any way, to be in a pretty early stage, but its goals fit well with my own preference for making decisions on a quantitative “rational” basis instead of just by the seat of one’s pants. In a nutshell, the idea is to define utility functions based on several orthogonal metrics – e.g. data availability, performance, and cost – and use those metrics to create a multi-dimensional design space. Design alternatives at successively finer levels of detail can then be selected by plotting each within the design space and selecting the one with the greatest overall utility according to some formula (e.g. greatest Euclidean distance from the zero-utility origin).

    My own thought about this was that merely assigning weights to the different axes, or combining them using different formulae, would be insufficient because of design or implementation risk (or quality). It’s highly unlikely that a design alternative would be adequately represented by a single point along any axis; more often, especially with finite schedules and resources, the proper representation would be a probability distribution. When the axes are combined, an alternative therefore occupies a field within the design space rather than a single point. The fields representing alternatives might vary in density or even be discontinuous; they might even overlap with one another. What this approach makes possible, though, is to compare alternatives at a selectable certainty level. Even more importantly, it allows the decision to be remade whenever the probabilities or schedules or comfort levels change, without having to redo every part of the computation.
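
    To make that concrete, here’s a minimal sketch (all metric names, numbers, and function names are my own invented examples, not anything from the talk): overall utility as Euclidean distance from the zero-utility origin, plus the probabilistic extension above, using Monte Carlo sampling so alternatives can be compared at a chosen certainty level.

    ```python
    import math
    import random

    # Each design alternative is scored on orthogonal metrics (say,
    # availability, performance, cost-efficiency), each normalized to
    # [0, 1]. Overall utility is Euclidean distance from the origin.
    def utility(point):
        return math.sqrt(sum(x * x for x in point))

    # Extension: model each metric as a distribution rather than a
    # point, then ask what utility we're `certainty` confident of
    # achieving -- the (1 - certainty) quantile of sampled utilities.
    def utility_at_certainty(metric_dists, certainty=0.9, samples=10000):
        rng = random.Random(42)  # fixed seed so the comparison is repeatable
        scores = sorted(
            utility([max(0.0, min(1.0, dist(rng))) for dist in metric_dists])
            for _ in range(samples)
        )
        return scores[int((1.0 - certainty) * samples)]

    # Two hypothetical alternatives: A is mediocre but predictable,
    # B has a higher mean on every axis but much more risk.
    alt_a = [lambda r: r.gauss(0.6, 0.02)] * 3
    alt_b = [lambda r: r.gauss(0.8, 0.25)] * 3

    print(utility_at_certainty(alt_a), utility_at_certainty(alt_b))
    ```

    The nice property is exactly the one claimed above: when a schedule slips or a comfort level changes, you re-sample with new distributions rather than redoing the whole analysis.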

  • Tapestry (various)

    This wasn’t exactly related to any particular talk, but the issue of the relationship between topological closeness within Tapestry and geographic or IP-network closeness seemed to keep coming up. In particular, Brocade seems to be an attempt to reconcile the two. Similar issues arise not only in Tapestry, but also in similar Plaxton/Chord/Kademlia/CAN networks. If I were a grad student myself, I’d be seriously thinking about a thesis on how to construct such networks so that the potential for “optimal at one level, terrible at the next” routing was minimized.
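
    As a toy illustration of why the mismatch arises (a drastic simplification of Plaxton-style prefix routing, not Tapestry’s actual algorithm): overlay distance is roughly a count of unmatched ID digits, and that count says nothing at all about geographic or IP-network distance.

    ```python
    # Toy Plaxton/Tapestry-style model: node IDs are fixed-length hex
    # strings, and each overlay hop extends the prefix shared with the
    # destination by one digit. Overlay "closeness" is therefore purely
    # a property of the IDs -- two overlay-adjacent nodes can sit on
    # opposite sides of the planet, which is the tension Brocade tries
    # to reconcile.
    def overlay_hops(src, dst):
        """Digits left to fix = worst-case overlay hops from src to dst."""
        shared = 0
        for a, b in zip(src, dst):
            if a != b:
                break
            shared += 1
        return len(dst) - shared

    # One digit apart in the overlay, arbitrarily far apart on the wire:
    print(overlay_hops("4227a", "4227b"))  # -> 1
    # Maximally far apart in the overlay, possibly in the same rack:
    print(overlay_hops("4227a", "b113f"))  # -> 5
    ```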

  • Simultaneous Insertions in Tapestry (Kris Hildrum)

    First off, I’d like to point out that this presentation had to be put together in a hurry to fill a schedule gap resulting from one of the aforementioned absences, and insertion/membership algorithms are hellishly difficult to explain even with all the time in the world. I’m sure an easier-to-follow explanation will be forthcoming soon.

    The important thing about this work is, really, that it’s being done successfully at all. The OceanStore group seems very acutely aware of the problems that high “churn” rates in a network can bring, and they’re making significant progress on protocols that can survive and yield correct results even in such an environment.

  • The OceanStore Write Path: A Quantitative Analysis (Sean Rhea)

    There’s a lot of interesting stuff here, but there’s one particular observation that I think deserves special mention: hashing isn’t free. Calculating something like an SHA-1 hash has all the system effects of a copy (which we all know by now to avoid) plus computational overhead, and hashing speed can actually become the main bottleneck in this type of system. Yeah, I know, to some it might seem obvious, but I see many others falling into the trap nonetheless. It’s worth the effort to manage hash information so it doesn’t need to be regenerated, to consider weaker (but cheaper) hashes, or to adopt algorithms that don’t depend on such hashes.
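
    A minimal sketch of the “don’t regenerate hashes” point (the class and its names are hypothetical, not OceanStore’s actual API): memoize each block’s SHA-1 by block ID so the digest is computed at most once per block version.

    ```python
    import hashlib

    class HashingBlockStore:
        """Toy block store that caches digests instead of recomputing them."""

        def __init__(self):
            self._blocks = {}   # block_id -> bytes
            self._digests = {}  # block_id -> cached SHA-1 hex digest

        def put(self, block_id, data):
            self._blocks[block_id] = data
            self._digests.pop(block_id, None)  # invalidate any stale digest

        def digest(self, block_id):
            d = self._digests.get(block_id)
            if d is None:
                # Hashing touches every byte, like a copy, plus the
                # compression-function cost -- so do it once and cache.
                d = hashlib.sha1(self._blocks[block_id]).hexdigest()
                self._digests[block_id] = d
            return d

    store = HashingBlockStore()
    store.put("b1", b"hello world")
    print(store.digest("b1"))  # computed here...
    print(store.digest("b1"))  # ...served from cache thereafter
    ```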

  • Towards Building the OceanStore Web Cache (Patrick Eaton)

    As I pointed out in my feedback, the historical record of adoption for distributed filesystems is pretty bad, and systems to handle email attachments haven’t fared much better. IMO, the OceanStore web cache is by far the most likely of the applications developed so far to actually benefit real people in the real world (if that’s a goal). Making that happen might involve some not-so-sexy hacking to deal with crappy web consistency models, but I think it’s a very important direction and I hope this work continues.
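
    For a taste of what that not-so-sexy hacking looks like, here’s a drastically simplified sketch of web freshness and revalidation logic (my own toy example, glossing over most of the real HTTP/1.1 caching rules, and nothing like the actual OceanStore cache internals):

    ```python
    import time

    # Decide whether a cached entry is still fresh from its max-age,
    # and build the conditional-GET headers for when it isn't.
    def is_fresh(entry, now=None):
        now = time.time() if now is None else now
        return (now - entry["fetched_at"]) < entry.get("max_age", 0)

    def revalidation_headers(entry):
        """Headers for a conditional GET once the entry has gone stale."""
        headers = {}
        if "etag" in entry:
            headers["If-None-Match"] = entry["etag"]
        return headers

    entry = {"fetched_at": 1000.0, "max_age": 60, "etag": '"v1"'}
    print(is_fresh(entry, now=1030.0))   # still within max-age
    print(is_fresh(entry, now=1100.0))   # stale, must revalidate
    print(revalidation_headers(entry))
    ```

    Even this toy version hints at the pain: origin servers lie about freshness, omit validators, or send none of these headers at all, and a cache layered on OceanStore would have to cope with all of it.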

  • 100-Year Storage (various)

    There was supposed to be a talk, or perhaps a breakout session, about the reasonableness of designing a storage system to last 100 years, but it kind of got lost in the shuffle. Given that many technological advances are likely to occur in the next century (duh) I think it’s critical that any system designed for such a timescale must also be designed to evolve over time, but that introduces a whole bunch of difficult problems. Dealing with multiple protocol versions is a major pain even in a very small network (been there, done that); is it even possible in a network of the size anticipated for OceanStore? Regenerating lost fragments that have been deliberately scattered all over the world is already a hard problem in OceanStore; how much harder is it to migrate fragments to a new format, hash type, or key length without disrupting anything? I think there’s a lot of brainstorming left to be done in this area.
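
    One tactic worth brainstorming about, sketched below with entirely invented formats (not OceanStore’s): give every fragment a tiny self-describing header carrying a format version and a hash-algorithm tag, so that a future reader can at least recognize what it’s holding and migration to a new hash can proceed fragment by fragment without touching payloads.

    ```python
    import hashlib
    import struct

    # Hypothetical registry of hash algorithms by wire tag.
    HASH_ALGOS = {1: hashlib.sha1, 2: hashlib.sha256}

    def pack_fragment(payload, version=1, hash_id=1):
        digest = HASH_ALGOS[hash_id](payload).digest()
        # Header: format version, hash-algorithm id, digest length.
        header = struct.pack("!BBH", version, hash_id, len(digest))
        return header + digest + payload

    def unpack_fragment(blob):
        version, hash_id, dlen = struct.unpack("!BBH", blob[:4])
        digest, payload = blob[4:4 + dlen], blob[4 + dlen:]
        if HASH_ALGOS[hash_id](payload).digest() != digest:
            raise ValueError("fragment corrupt")
        return version, payload

    def migrate(blob, new_hash_id):
        """Re-wrap a fragment under a newer hash; payload is untouched."""
        _, payload = unpack_fragment(blob)
        return pack_fragment(payload, version=2, hash_id=new_hash_id)

    old = pack_fragment(b"fragment bytes")             # SHA-1, version 1
    new = migrate(old, new_hash_id=2)                  # SHA-256, version 2
    print(unpack_fragment(new))
    ```

    The hard part, of course, is everything this sketch ignores: doing such a migration across millions of scattered, partially-reachable fragments without ever having both old and new formats disagree about what exists.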

  • Time Travel Data Store (various)

    Everywhere I go nowadays, whether it be to startups or to academic research labs, the same idea seems to keep coming up: a data store that lets you move backwards or forwards to any arbitrary point in time (not just to when you took the last snapshot). OceanStore could potentially act in this way, but there are pieces missing. Again, I think this is a very fruitful area for more research.
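
    As a sketch of the shape of the idea (a deliberately naive in-memory stand-in, nothing like OceanStore’s actual versioning machinery): keep every (timestamp, value) version of a key and answer reads as of an arbitrary time with a binary search, rather than only at snapshot boundaries.

    ```python
    import bisect

    class TimeTravelStore:
        """Toy versioned store: read any key as of any point in time."""

        def __init__(self):
            self._history = {}  # key -> sorted list of (timestamp, value)

        def put(self, key, timestamp, value):
            bisect.insort(self._history.setdefault(key, []), (timestamp, value))

        def get(self, key, as_of):
            versions = self._history.get(key, [])
            # Rightmost version written at or before `as_of`.
            i = bisect.bisect_right(versions, (as_of, chr(0x10FFFF)))
            return versions[i - 1][1] if i else None

    store = TimeTravelStore()
    store.put("k", 10, "old")
    store.put("k", 20, "new")
    print(store.get("k", 15))  # -> old
    print(store.get("k", 25))  # -> new
    print(store.get("k", 5))   # -> None (key didn't exist yet)
    ```

    The missing pieces in a real system are exactly where it gets interesting: bounding the storage for full version histories, and agreeing on what “a point in time” even means across a wide-area network.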