College Grads’ Expectations

There’s a story on Slashdot about kids coming out of college with unrealistic expectations of what their jobs will be like. I don’t actually see a lot of that in the people I interview, but I sure see it in many of the responses there. Many posters talk about how they got to do much more advanced stuff in school than in their first jobs, and how it’s such a waste when employers don’t let them keep using all of those skills. Here’s my own attempt at attitude adjustment.

Yes, a lot of students get to do very advanced work . . . within an educational context which is not (and should not be) the same as the business world. Many times, that advanced work (at least at the undergraduate level) consists of putting together ideas – or often actual code – that were provided by someone else, to prevent the lower-level details from diluting the higher-level lessons. That’s fine and even necessary in that context, but it’s utterly unlike the life of a real programmer. It’s entirely possible to look around yourself and think you’re a better programmer than all these other schleps, but that perception is often false because you haven’t even learned what’s important yet. You might actually be smarter than the person interviewing you, and you almost certainly received a better and more current education, but you are still almost certainly a far worse programmer than they are, despite all of that stuff you did in college.

When you code for a living, those low-level details I mentioned are no longer distractions that you can or should ignore. They’re things you have to do yourself, and do right, and keep doing over and over until it’s habit, because one slip can bring an entire project to a screeching halt, with far more severe consequences than botching some little two-person project worth 10% of one grade in school. Documenting everything is part of the job. Testing it yourself – not relying on some TA to do it – is part of the job. Integrating with the local build/installation system is part of the job. Tracking schedules and bugs is part of the job. Real code doesn’t just get run once for grading and then thrown away, either. It gets run again and again, in many configurations, including invalid ones put together by idiots and crazy people. As a result, another part of your job is to make things robust, to make sure that even the failure cases can be diagnosed, and to help bail out users when they still manage to break it in the field. Lastly, interviewing new hires is part of the job, and once you start doing that you’ll see how it’s possible for someone to have done advanced work and still be utterly unprepared for real work.

Let’s talk about travel a bit, too. You might travel as part of your job. It will probably be to less interesting places than you thought, on a tight schedule and a tighter budget. You won’t be flying first-class at convenient hours, talking strategy, eating at a fine restaurant with the customer, and then taking time off to see the sights. More likely, you’ll be flying coach and taking red-eye flights. You’ll spend your entire time in a cube or a lab, with a minder who clearly wants to be elsewhere but will nonetheless hover over you, ready to pounce on every typo, or every pause to consider options, as evidence that you’re an idiot. You’ll be working on some crazily misconfigured system, trying to finish your install or fix some bug and get out as soon as possible, both to be away from them and because you know work is stacking up back home and you’re still accountable for getting it done on time despite this days-long interruption. Even if there are any sights worth seeing, you won’t have time, and if there are any restaurants worth eating at you’ll be doing so at your own expense. Unless you can tag along with the local sales guy – in which case you have to be affable without being talkative, because you have no idea what they’ve told the customer before, and going “off message” in even the tiniest or least obvious way can screw everything up – you’ll be lucky to stay one step above airport food.

Have I made it sound unappealing enough yet? Well, try this: in a down economy, you’ll be competing against people who were the best and brightest of their own classes at their own schools – you didn’t think comparisons against only your immediate classmates meant anything, did you? – and who have a decade of directly relevant professional-level experience as well. Just about the only thing you’ll have going for you is lower salary expectations, so lower them now. Don’t think you’ll avoid all that by starting your own company, or joining a friend’s, either. Most of what I’ve said above still applies, and the same bad economy has made funding tight even for people with established track records, let alone for Some Random Kid. I’ve written before about the good aspects of a programming job, but you can’t have all that and a high salary and lots of ego-strokes all at once, right off the bat. With a few years under your belt you can be more assertive, but until then it’s not unreasonable for people to demand that you prove yourself. It’s not hazing, and it’s not elitism. A lot of very smart kids, with plenty of advanced work to show how smart they are, still wash out. As long as you and the set of people indistinguishable from you collectively represent a high risk, employers will quite rationally balance that high risk with lower pay and less responsibility. Over the next few years, as the risk falls because the wash-outs wash out and the survivors survive, both pay and responsibility will increase rapidly. You can bank on it, and you can demand it when the time comes, but you can’t really expect anyone to pay top dollar for an unproven worker. You wouldn’t want to work for anyone that crazy.

Death To Blue LEDs

I was just talking to a coworker about this, and realized it was worth a post. Better yet, it’s worth a picture.

I am bleeping sick of bright, flashing blue LEDs on just about every piece of electronic gear. One of the worst offenders is a Netgear WNR854T wireless router/AP that I bought recently. One of the activity lights is blue. It flashes a lot. It flashes even when there’s no reason to believe that there’s really any of the traffic that it supposedly indicates. It’s so bright that it casts a shadow all the way down the hall in the middle of the night – a flashing shadow, while I’m trying to go downstairs for a drink without turning on the hall light. Of course, if I look for even one moment at the light that’s casting the shadow – as everybody’s brain is hard-wired to do – then I’m blinded because blue is the worst color for destroying night vision. It took me all of one night to tape cardboard over that, but then there’s the (fortunately less bright) blue LED on the NAS box next to it, and the two blue LEDs on the computer next to that (which is usually off), and way too many others.

Manufacturers: stop putting bright blue LEDs on every darn thing, especially ones that flash. Tone them down (resistors are cheap) and/or use more reasonable colors. Blue LEDs were cool half a decade ago, but now they’re just annoying.

Going Galt

Back in the darkest days of the Bush administration, some liberals mused about moving to Canada or Europe. The response from the right side of the blogosphere was unanimous: go ahead, get out, good riddance. Now the shoe’s on the other foot. Many right-wingers, particularly of the libertarian persuasion, are talking about moving elsewhere (it’s never clear where) as a form of protest against the “socialist” Obama administration. They call it “going Galt” in homage to a character in Ayn Rand’s Atlas Shrugged. Of course, none of the people talking like this are in any way comparable to Galt. The real inventors and entrepreneurs who actually create something of value are too busy, well, creating things of value to waste time on such talk. The talkers are mostly bloggers and media types, people who think that serving a few web visitors’ recreational-reading needs makes them exemplars of economic productivity. Truly, they are legends in their own minds. To see where they’re drawing their inspiration, one need look no further than Jeremiah Tucker’s parody Atlas Shrugged, updated for the current financial crisis.

Galt went on like this for what seemed to Dagny like hours, until, finally, something he said piqued her interest.

“And that’s why I created the financial plan you found. It’s true, it works. But it is not sustainable. It will ruin this country’s financial system, and then we’ll see how those who despise us prosper when their lenders and investors refuse to invest or lend.” He laughed joylessly. “Funny, isn’t it? I must destroy the very thing I love in order to save it.”

“Just to avoid paying taxes?”

Tucker has succeeded in capturing not only the wooden awfulness of Rand’s prose, but also the more fundamental stupidity of the views she has her characters represent. The thing about markets and economics is that they’re all about connectedness. Even if all the bloggers who talk about “going Galt” really were the paragons of productivity that they imagine themselves to be, they’d still need somebody to buy what they create, and sometimes to provide essential services as well. The market of creative titans is too small to thrive by itself, cut off from the much larger market of everyday people buying things to satisfy their everyday needs. Rand never writes about Dagny Taggart cleaning toilets or Hank Rearden cutting people’s hair, but that’s what would have to happen if they were truly to isolate themselves from the rest of society. I suppose they could also resort to slave labor to stay isolated without doing all of those things themselves, and it’s an option I’m sure many real-life Randroids would turn to if they thought they could get away with it (which they practically already do when it comes to third-world labor), but it’s so starkly inconsistent with the freedom they demand for themselves that I’m going to dismiss the possibility.

Of course, all the “going Galt” folks understand the importance of large markets already. That’s why none of them are actually following through on their threats/promises to withhold their oh-so-valuable services from the rest of us. They know that if they were really to do that, they’d be impoverished and nobody else would even notice. Before long they’d come crawling back, perhaps wiser but at least less strident. Oh, how I wish they really did have the courage of their convictions.

UPDATE: after I wrote this but before I posted it, hilzoy examined the same phenomenon and drew approximately the same conclusions.

Another bash Puzzle

A while back, I wrote about some pitfalls of bash programming. Recently I tripped over a particularly insidious one, which seems worth sharing. Consider the following fragment.

echo foo | while read x; do
    myvar="after"
done
echo $myvar

Obviously the real example did something more complex, but this example is easier to write about. Now consider the following superficially equivalent fragment.

while read x; do
    myvar="after"
done < <(echo foo)
echo $myvar

Can you predict, without running them, what output each fragment will produce? It's easy to predict that they're different, or else I wouldn't be writing about them, but which one prints "after"? And why? It turns out that it's the second one. Why doesn't the first? Why doesn't the assignment within the loop seem to have any effect?

The answer is that bash runs each element of a pipeline in its own subprocess, and nothing in the code makes that obvious. The loop in the first example is part of a pipeline, so it executes in a subprocess; the assignment happens in that subprocess and - most egregiously of all - is not propagated back to the parent. The second example feeds the loop through process substitution instead of a pipe, so the loop runs in the parent shell and the assignment survives. Crazy, huh?
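If you want to see the subshell with your own eyes, bash 4 and later provide $BASHPID, which (unlike $$) reports the PID of the shell actually executing the code. Here's a quick diagnostic along those lines, not part of the puzzle itself:

echo foo | while read x; do
    echo "loop runs in $BASHPID"    # a child process
done
echo "script runs in $BASHPID"      # the parent shell

The two numbers differ. For what it's worth, bash 4.2 later added a lastpipe option (shopt -s lastpipe) that runs the last element of a pipeline in the current shell whenever job control is off, as it is in scripts; with that set, the first fragment would behave like the second.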

What's ironic about this, in retrospect, is that the one response to my previous bash article proposed exactly the pattern of the first example, subject to the same subtle failure to do what any programmer writing it would clearly have intended, for purely aesthetic reasons. That only reinforces my early point that bash almost seems to go out of its way to trip up even the most skilled and diligent programmers.

UPDATE: here's another example that works, and is cleaner in some ways but involves some even more obscure bash features. It's kind of a way to do coroutines in bash, which is an idea that sane minds would shun.

exec 6< <(echo foo)
while read x 0<&6; do
    myvar="after"
done
exec 6<&-
echo $myvar

Tuesday Pictures

Cindy has been sick a lot, so I’ve been spending more time than usual with Amy. On Friday I got to take her to swim class, which meant dealing with the awkwardness of taking a girl through the boys’ locker room. On Sunday we went to the New England Aquarium. It was mostly Amy’s idea, actually. Somehow we’d been talking about the aquarium we went to in Myrtle Beach, and I mentioned that there was one in Boston. The rest, as they say, is history. Her favorite exhibit was the penguins, especially diving and pooping (it’s a pre-schooler obsession). Anyway, here are some pictures of her and of other things from the last couple of months.

A gingerbread-house kit which finally got put together in February.
The view from my hotel room on my last trip to Boulder.
Amy at the jelly (not jellyfish because they’re not fish) exhibit.
My favorite – leafy sea dragons. There are actually three in the picture; two greenish-gold ones at the top and an even better-camouflaged one in the middle.
Amy and the shark (not sure what kind).

Earmarks and David Cay Johnston

I don’t have a problem with earmarks. Yes, that’s right. I know that earmarks are the bête noire du jour, but I think they’re fine. I do have a small problem with earmarks that are completely unrelated to the bills they’re attached to, and I have a big problem with earmarks that are mere patronage and graft rather than a productive use of taxpayer money, but those are separate issues. The basic idea of Congress expressing its specific will for how a particular piece of public money should be spent, instead of just handing out buckets of cash and letting some less-accountable entity within the executive branch decide how to spend it, seems quite reasonable to me. To pick the most obvious current example, if the purpose of a bill is to stimulate the economy, then any number of earmarks identifying congressional intent as to what kind of stimulus should be applied, and where, would be fine. It sure beats the Bush administration approach of handing boatloads of cash to a few banks with no transparency or accountability whatsoever. Anybody whining about earmarks now who was silent about no-bid government contracts last year is a hypocrite.

On a mostly different topic, if you want to read something more about these kinds of subjects, I highly recommend Amy Goodman’s interview of David Cay Johnston about our corporate taxes, stimulus packages, and all sorts of other stuff in between. If you actually believe all that malarkey about how rich people and corporations pay too much tax already, you owe it to yourself and your society to read DCJ’s book Perfectly Legal: The Covert Campaign to Rig Our Tax System to Benefit the Super Rich–and Cheat Everybody Else.

Scalable Flow Control

One of the problems that seems to occur again and again when “computing at scale” is some sort of server getting overwhelmed by requests from relatively much more numerous clients. Sooner or later, every kind of server has to deal with running out of buffers or request structures or (if the designers were fools) threads/processes, or just about any other kind of resource. Having the entire system grind to a halt because one server couldn’t handle overload gracefully and instead died a messy death is both unpleasant and unnecessary. It’s unnecessary because there are solutions that work. Unfortunately, there are other solutions that don’t work, but get tried anyway because they seem easier.

One such non-solution is to have lots and lots of servers, and spread the load between them as evenly as possible. Most people who have actually tried this have eventually realized that spreading the load that well is very hard if not impossible. Sooner or later, an access pattern appears that causes one server to get overloaded. Then it fails, increasing load on its peers (not just the shifted operational load but now the recovery load as well) and quite likely causing them to fail as well, and so on. If this sounds a bit like the northeast US power blackout in 2003, it should. Even if your load-balancing is really good and you’re committed to running your servers at 10% of capacity, a physical or configuration error could leave you with this sort of imbalance/failure cascade. The solution is to handle the condition, not avoid it, and that means some form of flow control. In other words, you have to make requests queue at the (more numerous) clients instead of within the servers.

Flow control can be implemented in many ways. It can be implemented at a low level to maximize generality and code reuse or at a high level to maximize efficiency and applicability to all kinds of resources. It can be implemented via credits that clients must hold or obtain before sending a request, or via “slow down” messages that are sent from servers to clients only when needed. The preachers of statelessness would say that the latter approach is less stateful and therefore preferable, but I think they’re mostly deluding themselves. For one thing, a lot of “stateless” servers have really just moved their session-layer state somewhere else (e.g. a database maintained by the application layer) instead of truly eliminating it. For another, the state after receiving a “slow down” message is still state that must be maintained. If clients can simply ignore such a message, or “forget” that they got it, you’ve achieved nothing whatsoever. If they’re bound to respect it, and especially if the server attempts to enforce it, then you’re just as stateful as you would be with full credit accounting but limited to only zero or one credit.
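To make the credit-based variant concrete, here’s a minimal client-side sketch in Python (purely illustrative; the class and method names are invented, and a real implementation would be tied to whatever messaging layer you use). Holding one credit per outstanding request makes excess requests queue at the client, exactly as prescribed above.

import asyncio

class CreditedClient:
    # Hypothetical client-side gate: one credit per outstanding request.
    def __init__(self, initial_credits):
        self._credits = asyncio.Semaphore(initial_credits)

    async def send(self, request, transport):
        await self._credits.acquire()   # excess requests wait here, client-side
        transport.write(request)        # the server never sees the backlog

    def on_reply(self, reply):
        self._credits.release()         # each reply hands its credit back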

So, if you’re going to use a credit approach, how should credit be allocated? Again, many will be tempted to use non-scalable approaches. Often the easiest thing to do is to allocate a worst-case set of resources (and associated credit) to every client. That can work OK with few clients, but rapidly leads to unacceptable levels of resources being allocated but idle when the node counts stretch into the hundreds or thousands. The opposite end of the spectrum is to require that all resources and associated credit be explicitly obtained, but this can lead to unacceptable first-request latency and performance anomalies as each batch of credit is consumed. In my experience, a hybrid approach works better: preallocate just enough resources and credit for each client to keep it happy while it explicitly requests and obtains more from a common pool. The common pool can then be large enough to satisfy the maximum worst-case load for the entire system, which is often much less than the sum of the worst-case numbers for each node, allowing support for many more clients at the same resource level. Also, the allocation requests and replies can often be piggy-backed on other messages so they carry little additional cost.
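Here’s an equally rough sketch of the server-side bookkeeping for that hybrid approach (again Python, again with invented names; PREALLOC and POOL_SIZE would really be tuned per system):

PREALLOC = 4        # per-client preallocation: just enough to keep it happy
POOL_SIZE = 1024    # sized for worst-case system load, not per-client sums

class CreditPool:
    # Hypothetical server-side credit accounting.
    def __init__(self):
        self.common = POOL_SIZE   # the shared reserve
        self.held = {}            # client id -> credits currently granted

    def register(self, client):
        # Every client gets a small stake up front, no questions asked.
        self.held[client] = PREALLOC

    def request(self, client, want):
        # Grant whatever the common pool can spare; both the request and
        # the grant could piggy-back on other messages, as noted above.
        grant = min(want, self.common)
        self.common -= grant
        self.held[client] += grant
        return grant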

The one remaining problem is how credit gets returned to the common pool when a client no longer needs it. This can be driven either by the client (when it recognizes that it no longer needs the resources) or by the server (when it needs to replenish the common pool). Since it’s generally hard to tell when a client doesn’t need credit any more, the client-driven approach usually involves giving up credit after a timeout. The server-driven approach, on the other hand, requires implementation of a credit-revocation exchange parallel to the credit-granting one. It’s even possible to combine approaches, and in fact I usually do, so that a client might give up credit either on its own initiative or in response to a server message, while reusing most of the same code for both cases.
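Continuing the hypothetical CreditPool sketch, the two return paths really can share most of their code; revocation just decides how much to take back and then performs the same bookkeeping as a voluntary release:

    def release(self, client, amount):
        # Client-driven return, typically after an idle timeout.
        amount = min(amount, self.held[client])
        self.held[client] -= amount
        self.common += amount

    def revoke(self, client, needed):
        # Server-driven return, used to replenish a low common pool.
        # (A real protocol would send a revoke message and adjust the
        # books only when the client acknowledges it.)
        surplus = max(self.held[client] - PREALLOC, 0)
        self.release(client, min(surplus, needed))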

With this kind of scheme – small amounts of per-client credit plus explicit requests and revocations of any credit over that amount, with credit-level changes potentially driven by either side – it’s possible to avoid server overload without either starving clients or wasting server resources. It’s not really as complicated as it might sound, and can be implemented with a negligible impact on common-case performance.

Cloud Computing and Teenage Sex

Don’t blame me for the comparison. It’s actually Walter Pinson’s.

It was once said back in the early ’90s that “Client/server computing is a little like teenage sex – everyone talks about it, few actually do it, and even fewer do it right. Nevertheless, many people believe client/server computing is the next major step in the evolution of corporate information systems.”

Can the same be said about cloud computing, today?

I contend that cloud computing is like teenage sex in another way: teenagers act like they invented sex, annoying their elders who thought that they invented it back when they themselves were teenagers. As Pinson’s reference to client/server computing makes clear, there’s a lot about cloud computing that’s not new. Some aspects go back even further. When people talk about how to bill for cloud computing, or how to insulate users from one another, it all starts to sound a lot like the old time-sharing days. It’s time-sharing on a new kind of system, but it’s time-sharing nonetheless.

There are people creating new technology in the cloud computing space, to be sure. (This is where the teenage-sex analogy breaks down.) I used to be one of them, and might be again in the not-too-distant future. There are far more people merely reinventing old technology in the cloud computing space. If anyone really wants to understand cloud technology and how it might best be deployed to create value, I think it’s important to understand which parts are actually new and how they’re new vs. what parts have already been done or tried.

P.S. While we’re talking about cloud analogies, Bruce Sterling had another good one.

Okay, “webs” are not “platforms.” I know you’re used to that idea after five years, but consider taking the word “web” out, and using the newer sexy term, “cloud.” “The cloud as platform.” That is insanely great. Right? You can’t build a “platform” on a “cloud!” That is a wildly mixed metaphor! A cloud is insubstantial, while a platform is a solid foundation! The platform falls through the cloud and is smashed to earth like a plummeting stock price!

There’s a lot of other randomness in there too, but the fiction author’s comparison of cloud-computing fiction to financial-market fiction is worth thinking through.