Java vs. C++

This was a response to Ian’s Progluminators journal entry, which is a spirited defense of the decision to use Java for Freenet and is mostly directed at the people who keep suggesting a rewrite in C++. Rewrite yes, C++ no.

I don’t particularly have a problem with using Java, just with using Java where it’s not appropriate. If you need to write something that has good and predictable performance, Java is the wrong choice – not because it’s interpreted, that’s a myth in the modern world of JIT compilers, but because of its voracious appetite for memory and its thoroughly non-deterministic garbage collection (an implementation artifact, but apparently a universal one). If your code might ever go into the kernel, or has to interface closely with something in the kernel, or even has to interface closely with some preexisting bit of native code in user space (we don’t all spend our lives reinventing the wheel), Java is the wrong choice. I’ve been on projects where we had to deal with a big glob of Java interfacing with another big glob that had to be native code, and it’s very ugly. No, JNI doesn’t cut it. If none of these concerns apply to your project, then go ahead and use Java. For that matter, in those environments you could just as well use Python or LISP or any of the functional languages and get even more of what Java (the language, not the library collection) claims to offer. There’s a criticism of C++ that IMO might also apply to Java: for everything it does, and for every potential use, there’s something else that’s even better.

Not that I’m a C++ fan. I tend to do most of my work in C because I have to, and when I don’t have to I jump up to Python (which interfaces with my C code much better than Java does, and doesn’t have a visible compilation step). There’s certainly a lot of ugliness in C++ but I can live with that. What drives me absolutely bonkers is the way people use C++. Almost every full-time C++ programmer seems to feel this bizarre compulsion to use every single feature all the time. Everything has to be a template, everything has to be overloaded, everything has to use “functors” and singleton idioms lifted out of some design-patterns book without a jot of understanding. I can deal with arbitrarily complex idioms and styles and APIs when I need to (there are some pretty hellish APIs in any OS kernel, of necessity) but I don’t appreciate it when code is obfuscated just for the sake of showing off what a C++ guru the author thinks he is. STL, which is to a large degree the source of this madness, has rotted a whole generation of programmers’ brains. The first several times I saw this phenomenon I just thought that particular programmer was a dork. I’ve seen it enough times now, though, that I think there must be something inherent in the language that activates the dork gene. The people who resist the temptation to write awful C++ code are the exception, not the rule. I’d gladly hire people with such strength of character…to work in some other language. For all of the problems I just mentioned with Java, I’d rather deal with that than with the garbage that most C++ programmers spew into a project.

Losing our lead?

One of the things I like to do, even when I’m not actually thinking of buying a new computer, is keep track of what’s out there and what I would buy if I needed a new one. Ever since I bought my Shuttle SV24 I’ve had a particular interest in small-form-factor and quiet PCs, so I’ve been watching developments in this space quite closely. The manufacturers in this space often don’t sell directly to consumers, and sometimes they do a poor job of identifying who does, so I head over to Google to find vendors and check prices. After all, the coolest component or system in the world becomes less interesting if it’s drastically overpriced.

Lately I’ve been noticing that a lot of the newest stuff, especially in the SFF/Quiet PC space, is available elsewhere before it’s available in the US. Often the stores are in Asia, close to the manufacturers, but a surprising number seem to be in the UK. For example, mini-itx.com quotes prices in British pounds. I’ve noticed that new cellphone and PDA models are almost invariably available in Japan before the US, and often in other Asian and European countries too. Is the US no longer the market of choice for introducing innovative high-tech products? Is the rest of the world starting to view us as technological conservatives, unwilling to try something new? That prospect worries me.

Bottomquark hacked

You might not want to follow the next link. This morning I found that bottomquark had been hacked by a bunch of Brazilian punks. Here’s a local copy of the result, which I’ve checked to make sure doesn’t contain anything nasty; the version that’s up there right now is OK, but who knows what the cretins who hacked the site might do to get more attention when the first buzz wears off?

Bottomquark is obviously a “target of opportunity” for these jerks. It’s a generic science-news site, without any particular political flavor. The other thing I find interesting is that the hack actually includes an email link…to an address at linuxmail.org. Unfortunately, that association between the open-source community and the script-kiddie community no longer comes as a surprise.

Flat(ulence) Tax

No, really. Those Kiwis truly are crazy.

Crazy Theories

First, the geek theory. Everybody in the UNIX/Linux world has by now heard about SCO claiming bits of their copyrighted code have made their way into Linux. What if it’s not a case of SCO being the source and Linux being the destination? What if the code is held in common because they both got it from the same place (e.g. BSD)? The juiciest variant would be the one in which SCO themselves violated some license or copyright (e.g. GPL) in obtaining the code, but Linux has it fair and square. Thousands of open-source programmers and other SCO-haters would die laughing.

Now for the political theory. What if the reason we can find neither weapons of mass destruction nor evidence of their destruction is that they were sold, not recently but long ago, to a powerful neighbor or distant superpower? What if they show up not in Syria where they were moved in panic this year, but in Pakistan where they were sold a decade ago? Would that resolve the apparent “now you see them, now you don’t” conundrum?

Philosophy 101

Here’s a quote from another well-known weblog:

If you are nice to someone, or behave positively and pleasant towards them, there is a 50/50 chance that they will respond in a similar fashion.

If you act negatively, or are hostile or destructive towards someone, there is a higher likelihood that they will respond similarly.

Do you think it’s true that people are more likely to reciprocate negative behavior than positive? Is the answer different on the net than it is in real life? I’m curious what my readers think.

Threads vs. Events, Round 5

It looks like there are several good papers at this year’s Hot Topics in Operating Systems conference. One particularly interesting item, because it’s a topic that used to be discussed frequently on this site (it even has its own archive section), is von Behren, Condit and Brewer’s Why Events Are A Bad Idea. They raise many good points about the basic duality of thread-based and event-based systems, and about the various minor advantages of each.

One part of the paper which especially caught my attention was as follows. Haboob is a web server based on SEDA, while Knot is one developed by the authors.

Haboob’s maximum bandwidth of 500 Mbit/s is significantly lower than Knot’s, because Haboob becomes CPU limited at 512 clients. There are several possible reasons for this result. First, Haboob’s thread-pool-per-handler model requires context switches whenever events pass from one handler to another. This requirement causes Haboob to context switch 30,000 times per second when fully loaded – more than 6 times as frequently as Knot. Second, the proliferation of small modules in Haboob and SEDA (a natural outgrowth of the event programming model) creates a large number of module crossings and queuing operations.

Consider, if you will, a couple of statements from my own server-design article:

Note that, in this model, queuing of requests is done within stages, not between stages. This avoids the common silliness of constantly putting a request on a successor stage’s queue, then immediately invoking that successor stage and dequeuing the request again; I call that lots of queue activity – and locking – for nothing.

SEDA’s one significant flaw, in my opinion, is that it allocates a separate thread pool to each stage with only “background” reallocation of threads between stages in response to load. As a result, the #1 and #2 causes of context switches noted above are still very much present.

In fact, the latter comment is merely an echo of something I said almost a year ago:

Message passing systems are only immune to the context-switch problem if they run everything in a single thread, which is to say that they’ve traded away multiprocessor scalability for simplicity and single-thread performance. Message passing systems that use multiple threads – yes, even staged systems like SEDA – are just as prone to excessive context switches as any multithreaded program ever was.

It’s always nice to see one’s views on such a contentious subject validated. If I’m lucky, I might even get to thank the authors in person soon.
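
To make the queuing point above concrete, here is a minimal Python sketch that counts queue operations for the same three-stage pipeline run two ways. It is a toy under stated assumptions, not code from Haboob, Knot, SEDA, or the server-design article: the function names, the `HANDLERS` list, and the operation counter are all made up for illustration, everything runs in one thread, and the locking and context switches that make the real cost worse are not modeled at all.

```python
from collections import deque

# Hypothetical three-stage pipeline: trim, upper-case, wrap.
HANDLERS = [str.strip, str.upper, lambda s: f"<{s}>"]

def run_with_queues_between_stages(requests, handlers):
    """Every stage has an input queue; each hand-off is an enqueue plus a dequeue."""
    queues = [deque() for _ in handlers]
    queue_ops = 0
    results = []
    for request in requests:
        queues[0].append(request)
        queue_ops += 1
        for i, handler in enumerate(handlers):
            item = queues[i].popleft()      # dequeue what was just enqueued
            queue_ops += 1
            item = handler(item)
            if i + 1 < len(handlers):
                queues[i + 1].append(item)  # enqueue for the successor stage
                queue_ops += 1
            else:
                results.append(item)
    return results, queue_ops

def run_to_completion(requests, handlers):
    """One queue at the entry; later stages are reached by direct calls."""
    entry = deque(requests)
    queue_ops = len(entry)                  # count the initial enqueues
    results = []
    while entry:
        item = entry.popleft()
        queue_ops += 1
        for handler in handlers:            # plain function calls, no queues
            item = handler(item)
        results.append(item)
    return results, queue_ops

if __name__ == "__main__":
    reqs = [" a ", " b ", " c ", " d "]
    print(run_with_queues_between_stages(reqs, HANDLERS))
    print(run_to_completion(reqs, HANDLERS))
```

Both functions produce the same results; the only difference is how many times a request is put on or taken off a queue. In a multithreaded server each of those operations would typically also involve a lock, which is the "lots of queue activity – and locking – for nothing" complaint.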

Out of the Closet

My employer finally came out of stealth mode yesterday, so I can finally talk a little bit about what we do. What we provide is an “appliance” that plugs into a storage network in front of your old disk array, and gives you the ability to turn back the clock on your storage to any arbitrary point in the past. This differs from the snapshot products everyone already has, because it doesn’t require that you had the foresight to do a snapshot just before your database went nuts and messed up all of your data. You can always restore to just one minute before, without needing an omniscient snapshot strategy. Furthermore, restoration is instant. Sure, we’re still doing stuff behind the scenes for a while but, as far as anyone in front of us is concerned, every block on that volume just went back in time. Lastly, we store all that old data in a way that’s very space-efficient. We’ve worked with some Big Brains on how best to do this, and the result is a huge improvement over snapshot or backup non-solutions that require anywhere from 2x to 6x your original dataset size to get even less functionality.
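
For readers who want a feel for what "any arbitrary point in the past" means at the block level, here is a toy Python sketch. It is emphatically not how the product works (nothing above describes the real design, and this naive version is not space-efficient at all); the class `TimeTravelVolume`, its methods, and the integer timestamps are all invented for illustration. The only idea it shows is that if every write is kept along with its time, a read "as of" some moment is just a lookup of the latest write at or before that moment, so no snapshot has to be planned in advance.

```python
import bisect
from collections import defaultdict

class TimeTravelVolume:
    """Toy block volume that keeps every write so it can be read as of any time."""

    def __init__(self):
        self._times = defaultdict(list)  # block number -> sorted write timestamps
        self._data = defaultdict(list)   # block number -> data for each timestamp

    def write(self, block, data, timestamp):
        times = self._times[block]
        if times and timestamp < times[-1]:
            raise ValueError("writes must arrive in time order")
        times.append(timestamp)
        self._data[block].append(data)

    def read_as_of(self, block, timestamp):
        """Return the block's contents as they were at `timestamp`, or None."""
        times = self._times.get(block, [])
        # Index of the first write strictly after `timestamp`.
        i = bisect.bisect_right(times, timestamp)
        if i == 0:
            return None                  # block had not been written yet
        return self._data[block][i - 1]

if __name__ == "__main__":
    vol = TimeTravelVolume()
    vol.write(7, b"good", 100)
    vol.write(7, b"corrupt", 160)        # the database goes nuts here
    print(vol.read_as_of(7, 159))        # contents just before the bad write
    print(vol.read_as_of(7, 200))        # contents after it
```

Because nothing is copied at restore time in this sketch, "going back" is only a matter of choosing a different timestamp for reads, which is one simple way to picture a restore that looks instant from the outside.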

So, what do I do? I’m one of three architects, with a particular responsibility for what we call the “platform”. There are six other people on my team, and I consider it my responsibility to keep them focused on doing the real work, so I get to spend way too much time in meetings or doing schedule stuff. I’ve even had to do slides. That doesn’t mean I’m uninvolved in the technical stuff, of course. If you’ve ever tried to stay on top of the technical issues affecting six hyper-productive developers, plus interactions with stuff going on in other groups, you know it’s not something just any techie could do, let alone some empty suit. I also have direct code-delivery and debugging responsibilities, though I have to admit I’ve been falling behind because of all the other stuff. Some of the people in my group would be hard to keep up with no matter what. We’re responsible for all of the drivers and OS interactions, plus general movement of data and messages between multiple independent nodes within the system. My own particular responsibility has to do with detecting and recovering from internal failures, without affecting the service we provide to users.

Obviously, this is all pretty intense. I’m working with what amounts to an all-star team, on some pretty interesting hardware that we’re using to do some pretty cool stuff. The reactions from people who’ve heard what we’re doing have also been very gratifying. I’ve worked at places where “what makes you so special” is a common question, but nobody is asking us that. People literally ask “can you really do that” and then start thinking up new ways to take advantage of our stuff. As our marketing VP put it, you often get to choose between technical risk and market risk. We’ve chosen a large technical risk in return for almost zero market risk. We all believe technology in general has advanced enough to make this possible, and that we’re the right people to do it first. There’s a lot of work involved but no real magic, and if we can get the technical stuff done the other stuff will practically take care of itself (as much as is ever the case). It’s a great environment to work in. We have good people doing that other stuff, so we geeks can just concentrate on the stuff we do best. And yes, we get to throw nerf footballs at each other while we’re doing it. All work and no play makes Jack burn out before the product’s out the door.