For a while now, I’ve been carrying a digital camera (Canon S-100 “Digital Elph”) in the same bag that I use to carry my laptop, PDAs, cellphone, etc. to/from work. This morning I thought of a new use for it. The parking structure next to where I work is a dangerous place. I constantly hear about people’s cars getting dinged, and I got a few scratches myself before I started avoiding the too-tight inside spaces – the ones labeled “compact cars only” even though on any given day at least half of them seem to be occupied by pickups, SUVs and minivans. However, one can never be too careful. As an additional safety measure, I’m going to start using my digital camera to snap a picture of the cars on either side of me every morning that I park there. That way, if I find any marks I can match them up against the colors and – if necessary – license plate numbers of those cars. It’s much easier than writing down the same info in a PDA or notebook, and seems like a great way to take advantage of the way digital photos that you don’t keep are absolutely free.
As to the high end… Fibre Channel is a step forward, but not enough. Forget all these special purpose buses anyway… my suggestion would be to put a Gigabit Ethernet interface and an IP stack directly in the drive.
IP is a poor match for storage needs, IMO. TCP in particular was designed – and designed rather well – for the high-latency small-packet environment of the Internet, but storage is a low-latency large-packet world. It’s also a world where the hardware must cooperate in ensuring a high level of data integrity, where robust and efficient buffer management is critical, etc. etc. etc. Even on cost, the equation does not clearly favor IP over FC. Sure, you get to use all of your familiar IP networking gear, but it will need to be upgraded to support various storage-related features already present in FC gear. Even on the controller end, do you really think a GigE interface plus an embedded IP stack is easier or cheaper to incorporate into a controller design than FC? I could go on, but I hope you get the point. “One size fits all” is a bankrupt philosophy. Let IP continue to be designed to suit traditional-networking needs, and for storage use something designed to suit storage needs.
Better to run something like GFS directly on the drive.
No, not better at all. Who wants the drive to be a bottleneck or SPOF? The whole point of something like GFS is to avoid those problems via distribution. Putting an IP stack on the drive is bad enough, and now you want to put a multiple-accessor filesystem on it? Dream on. People used to put things like networking stacks and filesystems on separate devices, because main processors were so wimpy, but they stopped doing that more than a decade ago. For a reason.
huge RAID arrays with one smart control node (like NetApps, etc)
NetApp doesn’t make disk arrays. If you look at the people who do make high-end disk arrays, you’ll see that they have far more than one brain. A big EMC, IBM, or Hitachi disk array is actually a very powerful multiprocessing computer in its own right, that just happens to be dedicated to the task of handling storage.
one drive per brain, a full computer in each drive, each drive a full node on the network
…at which point you’re back to distributed systems as they exist today, wondering how to connect each of those single brains to its single drive with a non-proprietary interface. Going around in circles like that doesn’t seem very productive to me.
Today’s entry is a tip for Windows users. I want to keep my bookmarks (“favorites”) in sync between my desktop at work, my laptop, and my desktop at home. Two solutions immediately present themselves:
- Use BookmarkSync.
- Store the bookmarks in a shared folder, and mount that shared folder from all of my machines.
To tell the truth, BookmarkSync works just fine, and it’s free. In fact, I’ve been using it for maybe a year now, and highly recommend it. But I’m a geek, so I decided to play around and see if there’s another way. The key (in more than one sense) turns out to be a registry setting:
HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders
The obvious thing to do is to point this registry key on all machines to a shared folder. Voila! It works, but I wanted to be even more creative. I want my bookmarks to be available on my laptop even when I can’t connect to work. Well, to do that I use the much-overlooked Windows “Briefcase” function. On the laptop, I make a sync copy of the shared favorites folder in my briefcase, then point the registry key to the sync copy instead of the original. Voila again! Changes are not propagated into the briefcase until I tell it to sync, but that’s not too much of a hassle.
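For the curious, here’s roughly what the change looks like as a .reg file, sketched in Python. The key path is the one above; the briefcase path is a made-up example, and a faithful entry would store the value as REG_EXPAND_SZ (the hex(2): form in .reg syntax) rather than the plain string used here for readability.

```python
# Sketch: generate a .reg file that points the Favorites shell folder at a
# shared (or briefcase) copy. The path passed in below is a made-up example.
KEY = r"HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders"

def favorites_reg(path):
    # .reg files use doubled backslashes inside quoted string values
    escaped = path.replace("\\", "\\\\")
    return 'Windows Registry Editor Version 5.00\n\n[{}]\n"Favorites"="{}"\n'.format(KEY, escaped)

print(favorites_reg(r"C:\Briefcase\Favorites"))
```

Double-click the resulting file on the machine in question (or use regedit’s import), then log out and back in so Explorer picks up the new location.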
But wait; there’s more. Let’s say that my shared favorites folder is at work. What about my desktop at home, which never connects to work? Well, I have a solution to that too, which is really not that useful but it shows off an even more creative way to abuse briefcase functionality. I go into my briefcase folder on my laptop, and share the favorites from there. Then, on my home desktop, I make a sync copy of the laptop’s sync copy, then point the desktop’s registry key to the second-generation sync copy. That’s right, folks; my laptop is now a briefcase proxy.
Cumbersome? Absolutely. Silly and pointless? You bet. BookmarkSync is a much better solution – it syncs automatically, instead of on demand twice (once from shared folder to laptop, again from laptop to home desktop) – but it was a fun exercise nonetheless. Besides, who knows if BookmarkSync will always be a free service? Maybe some day I’ll have to pay or switch. Now I know how to switch if I need to…and so do you.
I’m in a philosophical mood today. I’ve been re-reading George R.R. Martin’s Song of Ice and Fire series (A Game of Thrones, A Clash of Kings, A Storm of Swords); it got me thinking about promises and loyalty. Specifically, two questions:
- When you make a bargain or arrangement or contract with someone, you’re making a promise to them and they to you. How related are those two promises? If they break their promise, under what conditions are you freed from yours? Obviously it’s different for a business contract and an oath of fealty and a marriage. Is there a single framework within which these disparate ideas can fit? Being somewhat mathematically minded, my first instinct is to reach for some sort of “point system”. Each part of the agreement is worth a certain number of points, say, and if you fall N points behind, the other person is allowed to violate some equal-points part of the agreement. Maybe at a certain deficit threshold the entire agreement is considered null and void, or maybe there’s a specified cost (e.g. in dollars) per point. Simple agreements would just have one point on each side. Yeah, it’s way too quantitative for human agreements, but maybe there’s something useful in there for defining or enforcing rules in a distributed system.
- What happens when the party with which you made an agreement is absorbed by another, or thrust into conflict? This sort of thing comes up in the books when a lesser knight or lord swears fealty to a greater lord, who in turn is vassal to a king. Then the greater lord rebels. What’s the lesser lord to do? To whom do they owe loyalty? A very different twist on the same thing in the modern world is what happens when the company you work for gets bought out. To what extent do you owe loyalty to your new employer, particularly when they do not hold to – or even know about – the non-contractual understandings you had reached with your previous boss before the buyout? Maybe that’s a reason to get everything in writing ahead of time, but IMO that way lies madness. Besides my revulsion for the idea that any written deal unconditionally takes precedence over any spoken one, nobody can anticipate all of the future deals and possible conflicts that can occur. Divided loyalty and irreconcilable promises are always going to happen; whatever you might do to avoid them, there are still interesting questions of how to deal with them when they inevitably occur anyway.
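Just to show how mechanical the point system from the first question would be, here’s a toy sketch in Python. All the names and thresholds are invented for illustration.

```python
# Toy sketch of the "point system": each breach adds to a party's deficit,
# the other side may withhold an equal-points clause, and a big enough
# deficit voids the whole agreement. Names/thresholds are invented.
class Agreement:
    def __init__(self, void_threshold=10):
        self.void_threshold = void_threshold
        self.deficit = {}  # party -> points behind

    def breach(self, party, points):
        self.deficit[party] = self.deficit.get(party, 0) + points

    def may_withhold(self, party, other, points):
        # You may violate a clause worth up to the other side's deficit.
        return self.deficit.get(other, 0) >= points

    def void(self):
        return any(d >= self.void_threshold for d in self.deficit.values())

a = Agreement()
a.breach("them", 3)
assert a.may_withhold("me", "them", 3)
assert not a.may_withhold("me", "them", 5)
assert not a.void()
```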
I don’t have any conclusions about all of this. About the only concrete thought that occurs to me is that, if I ever own a business that’s about to go public or be sold, I should stop and think about whether I’m “selling out” my employees or creating unnecessary ethical dilemmas for them.
Zooko got me thinking about multi-threaded vs. event-based programming again. Damn you, Zooko! ;-) Here’s my contribution to the ongoing dialog.
One of my favorite observations is that all programmers get to eat worms – sometimes you just get to choose which can to open. This is a perfect example of that phenomenon. I’m going to offer some “compare and contrast” comments, but first I want to point out that both are full of dangers for the poorly prepared – deadlock, livelock, race conditions, timing/ordering problems, you name it. Also, I need to point out that there are many flavors of event-based programming. Finite state machines allow you to use explicit state tables and a greater number of event handlers to replace a smaller number of event handlers full of conditionals. The “promises” used in E let you create arbitrarily complex webs of actions which will occur when other actions complete, with certain safety guarantees. It’s even possible to have a program that’s both multithreaded and event-based, if you allow parallel event-handler dispatch. All flavors of event-based programming tend to have a lot in common, though, so my comments are pretty “flavor-agnostic” – except that I assume events are being used as an alternative to multithreading, and thus that event-based programs are not also multithreaded.
- Event-based programs tend to perform better on uniprocessors, and when event handlers can all complete quickly. Spawning new threads is notoriously expensive, and context switches don’t come for free either. Point: EB.
- Multithreaded programs can perform better on multiprocessors, when event handlers might need to block for extended periods, or when the code makes external calls where no asynchronous interfaces are available. In this last case, adaptations to event-based programming are possible but almost always exact a heavy toll on performance. Point: MT.
- With multithreading, you always seem to get an endless proliferation of locks that you’re constantly taking and releasing, always having to worry about lock hierarchies and deadlock. Point: EB.
- With event-based programming you always seem to get an endless proliferation of events and/or states and their associated functions. Point: MT.
- Both deadlock and livelock are problems with both approaches, usually taking the form of dead-end states or event starvation when using the event-based model. E’s promises offer some hope for the future in this regard. Half point: EB.
- Event-based programming requires that all state be explicit, and (largely for this reason) loops are not handled well. Common idioms such as retry loops or scanning multiple objects can get to be very painful. Point: MT.
- Event-based programs can be more conservative of stack space because everything runs directly from the dispatcher. On the other hand, the absence of stack traces can make debugging more difficult. On the third hand, event logs can take the place of stack traces in most cases. Half point: EB.
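To make the explicit-state and stack-space points concrete, here’s a minimal single-threaded dispatcher in Python. Handlers run to completion from the dispatch loop, and a retry “loop” becomes a self-posting handler with an explicit counter; the failure condition is faked for illustration.

```python
# Minimal single-threaded event dispatcher. Handlers run to completion from
# the dispatch loop, so any state that survives between events must be
# explicit (here, in a `ctx` dict) rather than living on a thread's stack.
from collections import deque

events = deque()

def post(handler, ctx):
    events.append((handler, ctx))

def dispatch():
    while events:
        handler, ctx = events.popleft()
        handler(ctx)

# A retry "loop" becomes a self-posting handler with an explicit counter.
def try_op(ctx):
    ctx["attempts"] += 1
    if ctx["attempts"] < 3:  # pretend the first two tries fail
        post(try_op, ctx)

ctx = {"attempts": 0}
post(try_op, ctx)
dispatch()
print(ctx["attempts"])  # 3
```

Notice how even this trivial retry logic forces the loop counter out of local variables and into an explicit context object – exactly the pain point mentioned above.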
As you can see, it’s pretty much a draw. That’s really the point: there’s room for both. Neither is “wrong”; use the right tool for the job at hand. Often one concern or another will leave you with no choice. Idealism and purity are for scientists and dreamers; engineers who have jobs to do should consider all alternatives and all factors that allow them to choose one alternative over another.
Today’s topic is breaking software with “upgrades”. Did you read my previous whine about Joust? Well, now shockwave.com has “upgraded” their games section. Along the way, the high-score lists disappeared. They added a rating functionality which spits out an error when you try to use it, but it doesn’t matter because there’s no way to see games’ ratings anyway. Worst of all, they “upgraded” the game engine itself so that it runs at different speeds on different hardware, meaning that it’s so insanely fast on my desktop machine at work that it’s practically unplayable. Elsewhere, slashdot.org has “upgraded” their site, so now it’s just totally broken. I can’t even reach it now. My website hosts seem to be having trouble, too; web service works, but email and telnet logins don’t work.
Nice going, shockwave and slashdot. You just have no idea how much your users appreciate the way you break stuff that was working before. You just have no clue…period.
Back when I was younger, I was addicted to Joust. Recently I discovered a (mostly) true-to-life version on shockwave.com, and I’m hooked again. As I play it now, though, I’m noticing some of the diabolical things the game does to kill players. Here are some examples; some are excusable, some are just examples of the programmer being a jerk. None of them will make sense if you haven’t played the game.
- Have you ever noticed how important the shape of the computer players is, compared to your own shape? You have your head over your lance, effectively making it very short, while their lances stick straight out unimpeded so they can reach under that middle island and poke you from a distance. You have little stubby wings, while they have these huge things set way up high, almost like having a second lance in the back. Cute, huh?
- Anybody who has played at all knows that every computer-controlled bird is stronger than yours. Here’s what you probably didn’t know: their birds are even stronger when they’re near you. Yes, that’s right. Have you ever noticed how that enemy who’s been struggling to get free of the lava troll for the whole level suddenly gets free as soon as you’re about to kill them? That’s not coincidence; your proximity is what gave them the extra “juice” to get free. This little fact also explains that incredible “kick” an enemy sometimes gets from bouncing off a stone island when you’re nearby, or how that enemy you totally had beaten suddenly gets above you just as you meet.
- Closely related is the fact that enemy birds don’t actually flap. Instead, the game decides how much lift they should have, moves them appropriately, and then moves their wings (if possible, which it often isn’t) to match their vertical motion. Their movement remains perfectly smooth, unlike yours which involves lots of bouncing up and down as you flap. The most obvious manifestation of this is the infamous “superflap” that allows them to stay right at the very top of the screen with their wings a blur. This smoothness also allows them to sail under the middle island to get you, in a way that you with your need to flap could never manage.
No list of complaints about Joust would be complete without a dishonorable mention of Digital Eclipse, the company that obtained the rights to the game and licenses their port to Shockwave. DE’s port is simply broken. What happens is that, when the game is busy, it queues up to a half-second or so worth of keyboard events, then either executes them all in a flurry (resulting in a most unwelcome burst of motion, usually lateral and often getting you killed) or just drops them entirely (resulting in you dropping like a stone and often getting you killed). What’s really annoying about this is that there’s not even a support address or anything you can use to alert them to the problem. They’re apparently more interested in playing with their sickeningly cute website than in fixing – or even hearing about – major bugs in their licensed products. Such people are a disgrace to the programming profession.
On a different note, does anyone else think Joust would be the best starting point for a video-game version of Harry Potter’s “Quidditch”?
The second thought has to do with communications protocols. Zooko was having a problem in Mojo Nation that involved events on a queue being processed out of order, and it turned out to be related to the fact that sometimes the system clock goes backwards. Here’s the sequence:
- Event A is scheduled, at current time plus X.
- Clock runs backwards.
- Event B is scheduled, at current time (now earlier than before) plus X.
- Event B runs before event A.
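The failure is easy to reproduce with a toy scheduler keyed on a settable clock; the names and numbers below are invented, but the mechanism is the same. (Python’s time.monotonic() exists for exactly this reason: it never runs backwards.)

```python
# Sketch of the failure: a scheduler keyed on wall-clock time misorders
# events when the clock steps backwards between two schedule() calls.
import heapq

class Scheduler:
    def __init__(self, clock):
        self.clock, self.heap, self.n = clock, [], 0

    def schedule(self, name, delay):
        self.n += 1  # tiebreaker so equal times dispatch in schedule order
        heapq.heappush(self.heap, (self.clock() + delay, self.n, name))

    def run_order(self):
        return [heapq.heappop(self.heap)[2] for _ in range(len(self.heap))]

now = [100.0]                     # fake, settable clock
sched = Scheduler(lambda: now[0])
sched.schedule("A", 5)            # A scheduled first, for t=105
now[0] -= 10                      # clock runs backwards
sched.schedule("B", 5)            # B scheduled later, but for t=95
print(sched.run_order())          # ['B', 'A'] -- B jumps the queue
```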
Well, the obvious solution to this problem would be to ensure that the clock never runs backwards, and in effect that’s what Zooko did. However, I got to thinking about why this was ever a problem. First, consider what our expectations should be when we schedule an event:
- The handler for an event should run at or after its scheduled time.
- If two events are scheduled for different times, the one with the later time should not be dispatched before the one with the earlier time.
In a true real-time system, we may add one more expectation:
- The interval between an event’s scheduled time and completion (not just dispatch or even execution) of its handler has a finite predetermined bound X. Therefore, the handler for an event scheduled for time T+X should not run before the handler for an event scheduled for time T.
We’re not dealing with real-time systems, so in a sense the discussion should simply end there. However, let’s say just for the sake of argument that we have defined upper bounds for the dispatch-to-execution delay and the event-handler execution time. If we want to ensure ordering of events we have to separate them by at least the dispatch-to-execution delay. In addition, if the events are being scheduled from different contexts and either of those contexts is itself an event handler, we need to increase that difference by the event-handler execution time. If we don’t account explicitly for both of these factors, we should not be surprised when events are handled out of order – even in a real-time system.
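The separation rule is simple enough to write down as a function; D and H here are assumed upper bounds (in milliseconds), not anything you could portably measure.

```python
# Minimum separation between two events' scheduled times needed to guarantee
# ordering, per the reasoning above. dispatch_delay (D) and handler_time (H)
# are assumed upper bounds, in milliseconds.
def min_separation(dispatch_delay, handler_time, scheduled_from_handler=False):
    sep = dispatch_delay
    if scheduled_from_handler:
        # Scheduling from within an event handler adds up to one full
        # handler execution before the second schedule() even happens.
        sep += handler_time
    return sep

print(min_separation(10, 50))                                # 10
print(min_separation(10, 50, scheduled_from_handler=True))   # 60
```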
As you can see, dealing with time properly is a pain. It’s even worse in a distributed system. In the hope that it will save someone else from weeks or even months spent pulling their hair out dealing with time-related bugs – as I used to do before I “saw the light” – I offer the following suggestions for protocol designers:
- Never rely on absolute time. There’s effectively no such thing in a distributed system anyway.
- Keep your dependencies on relative time (e.g. timeouts) to an absolute minimum.
- Never rely on the relationships between separate intervals. That’s tantamount to relying on absolute time.
If you follow these simple rules, you’ll save yourself a lot of grief. You’ll also find that your protocol is easier to describe in terms that a protocol validator will understand, which will allow you to use such validators to avoid other kinds of bugs. It’s worth it; trust me.
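As a tiny illustration of the second rule: express a timeout as a relative interval turned into a deadline against a monotonic clock, with wall-clock time never entering into it.

```python
# A timeout as a relative interval against a monotonic clock. Unlike
# wall-clock time, time.monotonic() never steps backwards, so the
# deadline comparison is always safe.
import time

def deadline(timeout):
    return time.monotonic() + timeout

def expired(d):
    return time.monotonic() >= d

d = deadline(0.2)      # 200 ms from now
assert not expired(d)
time.sleep(0.25)
assert expired(d)
```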
The first thought has to do with the definition of a “filesystem”. I defined it, somewhat circularly, as something that answers to calls such as mount, open, read, write, stat, mmap, etc. – or equivalents for other OSes. A key part of this is that a filesystem is fully transparent and fully integrated with the OS. A file within a real filesystem is indistinguishable from any other file, and is manipulated via the same interfaces. Therefore, anything that requires you to use its own applications, libraries or methods, that requires you to recompile or relink in order to use files stored therein, is not a filesystem. Ext2fs, NFS, Coda, and ISO-9660 are filesystems. Napster, Gnutella, Freenet and Mojo Nation are not filesystems. It would be possible to implement a filesystem interface to something like Freenet or Mojo Nation, although doing single-byte writes to a system based on a write-once file-granularity paradigm raises some serious performance and consistency issues. If this were done, one could call Freenet or Mojo Nation a filesystem (a sucky one), but it hasn’t been done yet and until it’s done they’re not filesystems.