Future Projects

I’m going to have two weeks off between jobs, so I’ve been thinking about how I’m going to use the free time. I’ll probably spend a lot of it just kicking back and relaxing, finishing my game of Dungeon Siege, playing chess and chatting online, maybe watching some bad movies while Cindy’s not around. No, not that kind of bad movie! Sheesh. I mean movies that have too much violence or stupid humor or funky camera work that would just not be interesting (or watchable) for her. I’ll probably spend some time catching up with friends over long lunches, maybe take some nice long walks, work out every day to get myself back in shape, etc.

Yeah, that’s all nice, but probably not very interesting to anyone but me. The interesting part is that I might also spend a few days working on various technical projects that I haven’t had time for. Here are a few that are near the top of my list.

  • Getting this website transferred to a real weblog/comment system. I’ve had a lot of fun rolling my own, and I could integrate the content management with the comment system myself if I really wanted to, but I sort of think that project has run its course. Whatever package I use (currently leaning toward b2 but I’m open to suggestions) will involve me writing some scripts to import my existing content, and maybe some hacking to add specific features that I want (like the tunable RSS generation). It’s not rocket science, but it’s not trivial either.
  • A filtering email proxy. Partly I’d like to do this to illustrate the principles in my server-design article (now linked from the front page) but it’s also something I’d really like to have. I’ve been using Proxomitron for a while, and I really like having the ability to rewrite – as opposed to just throw away – dangerous or objectionable content, but I haven’t seen anything I like that does the same thing for email. I’d like to build something with a plugin interface, and possibly write a plugin that interfaces with Vipul’s Razor just for extra fun. Between regexp-like rewriting rules and plugins, such a tool could do everything from virus protection to stripping images.
  • I still have lots of ideas about distributed storage, most of them in directions so different from my last project that I haven’t been able to pursue them. That’s more than a two-week project, though, so I’ll probably do more thinking and writing than actual coding.
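
To make the email-proxy idea a bit more concrete, here is a minimal Python sketch of the filtering core only: no networking, just a chain of plugins that each take a message and hand back a (possibly rewritten) one. Everything named here (`Plugin`, `regex_rewriter`, `subject_tagger`, `filter_message`) is made up for illustration, the rules are toy examples rather than real virus or HTML handling, and the tagging plugin merely marks where something like a Razor check might hook in; no actual Razor interface is used or implied.

```python
import re
from email import message_from_string
from email.message import Message
from typing import Callable, List, Tuple

# Hypothetical plugin interface: a plugin takes a message and returns a message.
Plugin = Callable[[Message], Message]


def regex_rewriter(rules: List[Tuple[str, str]]) -> Plugin:
    """Build a plugin that rewrites the body using (pattern, replacement) rules."""
    compiled = [(re.compile(pat, re.IGNORECASE | re.DOTALL), repl)
                for pat, repl in rules]

    def plugin(msg: Message) -> Message:
        if msg.is_multipart():
            return msg  # assumption: this sketch only handles single-part mail
        body = msg.get_payload()
        for pattern, replacement in compiled:
            # Rewrite rather than discard: the rest of the message survives.
            body = pattern.sub(replacement, body)
        msg.set_payload(body)
        return msg

    return plugin


def subject_tagger(tag: str) -> Plugin:
    """Stand-in for an external check: it just prefixes the subject line."""
    def plugin(msg: Message) -> Message:
        subject = msg.get("Subject", "")
        del msg["Subject"]  # no error if the header is absent
        msg["Subject"] = f"{tag} {subject}"
        return msg

    return plugin


def filter_message(raw: str, plugins: List[Plugin]) -> Message:
    """Parse raw message text and run it through each plugin in order."""
    msg = message_from_string(raw)
    for plugin in plugins:
        msg = plugin(msg)
    return msg
```

A caller would build the plugin list once, for example `[regex_rewriter([(r"<img\b[^>]*>", "[image removed]")]), subject_tagger("[filtered]")]`, and pass each incoming message through `filter_message` before relaying it. Keeping plugins as plain callables means rewriting rules and external checks compose in whatever order the list gives them.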

I know there are many more things I’ve talked about at one time or another, but I’ve forgotten most of them. These are the big ones for now, though, so if you have any suggestions that would help me along feel free to send some email.

Growling Platypus

I finally found an actual recording of a platypus!

Tunable RSS

At Ziv Caspi’s request (sort of) I’ve added a way to tune the length of items in my RSS feed. All that’s involved is passing a number to the script that generates it, as follows:

Enjoy. I sort of wish more RSS feeds worked this way, since it’s kind of wasteful to send an entire article only to have the client throw away most of it.
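
For the curious, the server-side trimming could look roughly like the Python sketch below. This is not the actual script behind the feed; the function names and the `max_len` parameter are invented for illustration, and it assumes the number means "maximum characters per item".

```python
from typing import List, Optional


def truncate(text: str, max_len: Optional[int]) -> str:
    """Trim text to at most max_len characters, breaking at a word boundary."""
    if max_len is None or len(text) <= max_len:
        return text  # no limit requested, or already short enough
    cut = text[:max_len]
    # Prefer to break at the last space so a word is not split in half.
    space = cut.rfind(" ")
    if space > 0:
        cut = cut[:space]
    return cut.rstrip() + "..."


def feed_descriptions(articles: List[str], max_len: Optional[int]) -> List[str]:
    """Apply the same length limit to every item that goes into the feed."""
    return [truncate(article, max_len) for article in articles]
```

The point of doing this on the server is that the client asks for the length it wants and only that much text crosses the wire.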

Random Amusements

Some people go for light summer reading. I go for A Problem From Hell. Nothing quite like sitting by the pond in my swimsuit, reading about the Khmer Rouge genocide. Am I brain damaged, or what?

Perhaps to make up for the heavy, serious nature of the above tome, I also spent a good chunk of my weekend playing Dungeon Siege. It’s an amazingly large game. Right now I’m looking for something in a swamp. I don’t actually know if it’s even in the swamp, and the freedom with which I can roam around is actually a little bit disorienting. The map facilities are pretty seriously deficient, so it can be hard to remember where you’ve already been or which way to what. I’ll have to start drawing maps by hand just to keep track, and that sucks all of the spontaneity – one of the game’s strong points – out of things. Sigh. It’s amazing how one flaw can detract so much from an otherwise enjoyable game experience.

Reduce This

Is it just me, or does the legal phrase “reduced to practice” – often appearing in patents and employee contracts dealing with so-called intellectual property – seem backward to others as well? Does the existence of an actual implementation reduce or diminish the original abstract idea in some way? Au contraire, says I. Implementation validates the original idea, and often extends it as well. They should say the idea is “advanced to practice” instead, in my opinion.

Virtual Machines

David McCusker quotes Pierre Phaneuf (probably without permission):

one of my pet peeves with virtual machines is that their opcodes are often too “small” (they don’t do enough), leading to things having to be done in the bytecode, where if it could have “larger” opcodes that did more, more time would be spend in the efficient, possibly optimized, native code of the virtual machine implementation itself.

My first reaction was, “But that sounds like CISC, and we all know RISC is better!” Then I got to thinking, and I’m not so sure. Consider that the justification for RISC lies in the following formula:


	work/second = work/instruction * instructions/cycle * cycles/second

As elucidated in Hennessy and Patterson, the idea behind RISC is to enable great increases in the last two factors, for only a modest decrease in the first, yielding better overall performance. The quantitative reasoning they use to make the case is exemplary not only in the context of computer design but also as an exercise in how to reason about performance in complex systems, which is why I think even software engineers should read the book.

The problem is that the idea of cycles is central to the formula, but it’s almost meaningless in the context of a virtual machine (or asynchronous logic, but that’s a whole ‘nother article). The two factors that RISC attempts to increase collapse into one – instructions/second – and Pierre rightly points out that it’s not improved by using smaller/simpler instructions. Making all virtual instructions execute in the same amount of time is not useful, and issuing virtual instructions in parallel is generally not feasible. If you’re interpreting virtual instructions, maybe the RISC approach really doesn’t make sense.
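
A toy interpreter makes the point easier to see. The Python sketch below is an illustration only: the opcode names (`PUSH`, `LOAD`, `ADD`, `SUM_ALL`) are made up, the "small opcode" program is unrolled straight-line code rather than a loop to keep the interpreter tiny, and the count of dispatched instructions is just a rough proxy for interpreter overhead.

```python
from typing import List, Optional, Tuple

Instruction = Tuple[str, Optional[int]]


def run(program: List[Instruction], data: List[int]) -> Tuple[int, int]:
    """Interpret a program; return (result, number of instructions dispatched)."""
    stack: List[int] = []
    dispatched = 0
    for op, arg in program:
        dispatched += 1  # every virtual instruction pays the dispatch cost once
        if op == "PUSH":
            stack.append(arg)
        elif op == "LOAD":
            stack.append(data[arg])
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "SUM_ALL":
            # "Large" opcode: the whole loop runs in the host implementation.
            stack.append(sum(data))
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop(), dispatched


def small_opcode_program(n: int) -> List[Instruction]:
    """Sum n elements using only PUSH, LOAD and ADD."""
    program: List[Instruction] = [("PUSH", 0)]
    for i in range(n):
        program.append(("LOAD", i))
        program.append(("ADD", None))
    return program


def large_opcode_program() -> List[Instruction]:
    """Sum every element with a single coarse-grained instruction."""
    return [("SUM_ALL", None)]
```

For a four-element array, the small-opcode program dispatches nine virtual instructions and the large-opcode program dispatches one, and both produce the same sum. With no cycles to shorten and no parallel issue to exploit, the only lever left is how much work each dispatch buys.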

On the other hand, how much sense does it make to worry so much about the performance of interpreted code? If you really want performance, you should be incrementally compiling those virtual instructions into real instructions for the processor you’re running on anyway. It is therefore the expressiveness of the virtual machine language that matters, not its performance when interpreted. That still tends to justify a “higher semantic level” approach which is roughly analogous to CISC, but IMO it’s a better justification than the “virtual performance” concerns discussed by Pierre and David.

Tarnished Silver Bullets

AccordionGuy quotes Jamie Zawinski, by way of Mark Pilgrim (enough attribution, already):

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

and then adds

You could also replace regular expressions with Perl and the sentence would still be correct.

Oh, the possibilities are endless! What other replacements would work?

  • A new law
  • A more rigorous process
  • Six more engineers
  • XML
  • Emacs
  • A honkin’ big disk array or NAS box

About the only thing that doesn’t fit is chocolate. You just can’t go wrong with chocolate.

As Promised

OK, now that it’s official I can write about it here. I’ve given notice at EMC, and will shortly be going to work at mumblemumble. I can’t really say anything about what mumblemumble does, because they’re all early-stage and stuff, but there are some things I can say:

  • I’ll have a really short commute (across town).
  • I’ll be working with some really cool people that I know from previous jobs.
  • The project is technically interesting.
  • The eventual product is one that I believe a lot of people will really find useful.

I’m excited, of course, but also a bit stressed. Considering that I’m not a big-company guy and I came to EMC “through the back door” via acquisition, it amazes me to realize that I’ve worked there longer than anywhere else. A lot of people have been telling me for a long time that I was overdue for a change. Well, now there’s change all right. ;-) Only time will tell whether it’s the kind of change I really needed, but for now I’m excited and optimistic and a little impatient to get on with it.

Turn and Face the Strain

Well, this cartoon obviously wasn’t written with me in mind. I’m sorry about the recent lack of updates, folks. Since the shakeup last week I’ve been kind of…shaken up. I’ve been tempted to do more “thinking out loud” here, but I don’t want to seem unprofessional, I don’t want to violate anyone’s privacy or confidentiality, and I don’t want to look back a year from now and marvel at what a jerk I was, so I’ve held myself back. I know that’s not the blog way, but part of my writing style is to wait until my thoughts on whatever topic have stabilized into a somewhat coherent form, and that hasn’t been happening lately.

On the other hand, watch this space. I can’t talk about it right now, but there’s some stuff going on that should lead to a fairly significant update here, followed by generally increased volume for a couple of weeks, possibly followed by another period of relative silence as my stock of free time fluctuates. Yeah, I’m sure about half of you know exactly what that means, but I want to torture the other half a little. ;-) Stay tuned.

Dijkstra Considered Beneficial

Edsger W. Dijkstra, he of structured programming and semaphores and Go To Statement Considered Harmful and other contributions too numerous to mention, has died. I’d like to suggest that everyone read at least the two papers linked above, and ideally more at the same site. Also, remember to thank him the next time you use a semaphore in one of your programs, because semaphores were more his idea than anyone else’s.