Using WordPress to Generate Static Pages

As you all know by now, I’ve changed the way I manage the content for this site. I now write posts in WordPress, then turn the results – after all of the database access, theme application, etc. – into static pages which are then served to you. One of my main reasons was increased security, which has been a problem with WordPress for years but which has generated a lot of interest recently because of the latest botnet. Therefore, I’ll describe what I’ve done so that maybe others can try something similar and maybe even improve on my recipe.

In order to understand what’s going on here, you have to know a bit about how a WordPress site is structured. It might surprise you to know that each post can be accessed in no fewer than six different ways. Each post by itself is available either by its name or by its number. Posts are also combined on the main page, in per-month lists, and in per-category lists. Lastly, the category lists are themselves reachable either by name or by number. In fact, if a post is in multiple categories it will appear in more than six places. It’s important to preserve all of this structure, or else links will break. This is why I didn’t use a WordPress plugin to generate the static content, by the way: even in the most cursory testing, every one I tried failed to maintain this structure properly. Most of the work I had to do was related to getting that structure just right, but first you have to generate all of the content, so I’ll start there.
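
To make that concrete, here’s roughly what those six access paths look like for a single (made-up) post. The exact patterns depend on your permalink settings, so treat the names and numbers as placeholders.

http://pl.atyp.us/wordpress/some-post-name/             # post by name
http://pl.atyp.us/wordpress/?p=1234                     # the same post by number
http://pl.atyp.us/wordpress/                            # main page
http://pl.atyp.us/wordpress/2013/04/                    # per-month list
http://pl.atyp.us/wordpress/category/some-category/     # category list by name
http://pl.atyp.us/wordpress/?cat=56                     # the same category list by number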

The basic idea for fetching the content is to crawl your own site with wget. Start with a normally working WordPress installation. Make sure you’ve set your WordPress options to use a name-based (rather than number-based) URL structure, and turned comments off on all posts. Then issue something like the following command.

wget -r -l inf -p -nc -D atyp.us http://pl.atyp.us/wordpress

This might take a while. For me it’s about twenty minutes, but this is an unusually old blog. Also, you don’t need to do a full crawl every time. Usually, you should only need to regenerate the post itself plus the global/month/category index pages it appears on, without touching other months or categories; a sketch of that follows below. At this point you’ll have a very simple static version of your site, good enough as an archive or something, but you’ll need to fix it up a bit before you can really use it to replace the original.
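
A selective refresh would look roughly like this. Dropping -nc lets the fresh copies overwrite the stale ones, and the month, category, and post URLs are placeholders for whatever the new post actually touches.

#!/bin/bash
# Hypothetical selective refresh: re-fetch only the new post and the index pages
# it appears on, instead of re-crawling the whole site.
for url in \
	http://pl.atyp.us/wordpress/ \
	http://pl.atyp.us/wordpress/2013/04/ \
	http://pl.atyp.us/wordpress/category/some-category/ \
	http://pl.atyp.us/wordpress/some-post-name/
do
	wget -x -p "$url"
done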

The first fix has to do with accessing the same page by name or by number. One of my goals was to avoid rewriting the actual contents of the generated pages. I don’t mind copying, adding links, adding web-server rewrite rules, and so on, but rewriting content would only fix the links within the pages themselves; any links coming in from outside would still be broken. My solution here has two parts. The first is a script, which finds the name-to-number mapping inside each article and uses that information to create a directory full of symbolic links. Here it is, but be aware that I plan to improve it for reasons I’ll get to in a moment.

#!/bin/bash
 
function make_one {
	# Canned Platypus posts
	p_expr='single single-post postid-\([0-9]*\)'
	p_id=$(sed -n "/.*$p_expr.*/s//\\1/p" < $1)
	if [ -n "$p_id" ]; then
		ln -s "$1" "post-$p_id.html"
		return
	fi
	# HekaFS posts
	p_expr=' name=.comment_post_ID. value=.\([0-9]*\). '
	p_id=$(sed -n "/.*$p_expr.*/s//\\1/p" < $1)
	if [ -n "$p_id" ]; then
		ln -s "$1" "post-$p_id.html"
		return
	fi
	c_expr='archive category category-[^ ]* category-\([0-9]*\)'
	c_id=$(sed -n "/.*$c_expr.*/s//\\1/p" < $1)
	if [ -n "$c_id" ]; then
		ln -s "$1" "cat-$c_id.html"
		return
	fi
}
 
find "$1" -name index.html | while read -r f; do
	make_one "$f"
done

Notice how I had to handle the two blogs differently? It turns out that this information is theme-specific, and some themes might not include it at all. What I really should do is get this information from the database (correlate post_title with ID in wp_posts), but this works for now.
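
If you’d rather go straight to the database, the sketch below shows roughly what that would look like. It’s untested, it assumes the standard wp_posts schema (post_name is the URL slug), and the user name, database name, and link-target path are placeholders for whatever your own setup uses.

#!/bin/bash
# Hypothetical database-driven version of the mapping script: build the
# post-number symlinks from wp_posts instead of scraping theme-specific markup.
# "wpuser", the "wordpress" database name, and the link target are placeholders.
mysql -N -u wpuser -p wordpress \
	-e "SELECT ID, post_name FROM wp_posts WHERE post_status='publish' AND post_type='post'" |
while read -r p_id p_name; do
	ln -s "../$p_name/index.html" "post-$p_id.html"
done

The second part is a web-server rewrite rule, to redirect CGI requests for an article or category by number to the appropriate link. Here’s what I’m using for Hiawatha right now.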

UrlToolkit {
    ToolkitID = cp-wordpress
    RequestURI isfile Return
    # Handle numeric post/category links.
    Match /wordpress/\?p=(.*) Rewrite /wordpress/links/post-$1.html Return
    Match /wordpress/\?cat=(.*) Rewrite /wordpress/links/cat-$1.html Return
    Call static-wordpress
}

What ends up happening here is that Hiawatha rewrites the CGI URL so that it points to the link I just created, which in turn points to the actual article. The "static-wordpress" URL toolkit handles another dynamic-link issue, this time related to JavaScript and CSS files.

UrlToolkit {
    ToolkitID = static-wordpress
    # Support multiple versions of CSS and JS files, with the right extensions.
    Match (.*)\.css\?(.*) Rewrite $1_$2.css Return
    Match (.*)\.js\?(.*) Rewrite $1_$2.js Return
    # Anything else gets the arguments lopped off.
    Match (.*)\?(.*) Rewrite $1 Return
}

I had to do this because it turned out that Firefox would complain about CSS/JS files not having the right content type. Hiawatha picks the MIME type based on the file extension, so anything that didn't end in .css or .js got served with the wrong type. For example, widgets.css?ver=20121003 wouldn't work. This rule rewrites it to widgets_ver=20121003.css, which does work. To go with that, I also have a second renaming script.

#!/bin/bash
 
workdir=$(mktemp -d)
trap "rm -rf $workdir" EXIT
 
find $1 -name '*\?*' | grep -Ev '&"' > $workdir/all
 
# Use edit-in-place instead of sed to avoid quoting nastiness.
 
# Handle CSS files.
grep '\.css' $workdir/all > $workdir/css
ed - $workdir/css << EOF
g/\([^?]*\)\.css?\([^?]*\)/s//mv '\1.css?\2' '\1_\2.css'/
w
q
EOF
 
# Handle JavaScript files.
grep '\.js' $workdir/all > $workdir/js
ed - $workdir/js << EOF
g/\([^?]*\)\.js?\([^?]*\)/s//mv '\1.js?\2' '\1_\2.js'/
w
q
EOF
 
# Handle everything else.
grep -Ev '\.css|\.js' $workdir/all > $workdir/gen
ed - $workdir/gen << EOF
#g/\([^?]*\)?\([^?]*\)/s//mv '\1?\2' '\1_\2.html'/
g/\([^?]*\)?\([^?]*\)/s//rm '\1?\2'/
w
q
EOF
 
. $workdir/js
. $workdir/css
. $workdir/gen

Note that the script also deletes other (non-CSS non-JS) files with question marks, since wget will leave some of those lying around and (at least in my case) they're invariably useless. Similarly, the static-wordpress rewrite rule just deletes the question mark and anything after it.
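
If you’re paranoid (I am), a quick sanity check afterwards is to look for anything the rename/delete pass missed and for symlinks that no longer point anywhere. Something like this should do it:

# Any query-string files still lying around?
find . -name '*\?*'
# Any symlinks whose targets no longer exist?
find . -type l ! -exec test -e {} \; -print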

At this point you should have a properly fixed-up blog structure, which you can push to your real server and serve as static files (assuming you have the right configuration). What's missing? Well, comments for one. I still vaguely plan to add an external comment service like Disqus or Livefyre, but to be honest I'm not in that much of a hurry because - while I do appreciate them - comments have never been a major part of the site. The other thing missing is search, and I'm still pondering what to do about that. Other than that, as the very fact that you're reading this demonstrates, the process described above seems to work pretty well. My web server is barely using any CPU or memory to serve up two sites, and my "attack surface" has been drastically reduced by not running MySQL or PHP at all.

P.S. Hiawatha rocks. It's as easy to set up as nginx, it has at least as good a reputation for performance, and resource usage has been very low. I'd guess I can serve about 60x as much traffic as before, even without flooding protection - and that's the best thing about Hiawatha. I can set a per-user connection limit (including not just truly simultaneous connections but any occurring within N seconds of each other) and ban a client temporarily if that limit is exceeded. Even better, I can temporarily ban any client that makes more than M requests in N seconds. I've already seen this drive off several malware attempts and overly aggressive bots, while well-behaved bots and normal users are unaffected. This probably increases my load tolerance by up to another order of magnitude. This might not be the fastest site of any kind, but for a site that has (almost) all of the power of WordPress behind it I'd say it's doing pretty well.
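
For reference, the knobs involved look roughly like this. I'm quoting the directive names from memory, so check the Hiawatha manual before trusting the exact spelling or values here.

ConnectionsPerIP = 10       # concurrent connections allowed per client
BanOnMaxPerIP = 60          # seconds to ban a client that exceeds that limit
BanOnFlooding = 25/5:300    # more than 25 requests in 5 seconds earns a 300-second ban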

Static Site is LIVE

If you’re seeing this, it’s because you’re on the new site, seeing static files served by Hiawatha instead of dynamic files served by nginx. If you notice anything else that’s different, let me know.

Static Site Update

As I mentioned too long ago, I’ve been planning to migrate this site to a different method of operation, for both performance and security reasons. Specifically, my approach allows me to add posts, change themes, etc. with all the power of WordPress and its community at hand, but then serve up the results as static pages. I have most of that working on my test site, with only two things not working: by-category listings (by-date listings work) and comments. I can actually do without comments for a while until I find an external solution that I like, but I feel like I do need to fix the by-category listings before I switch over. For the technically minded, here’s a rough outline of how I’m doing this.

  1. I have two Hiawatha configs – one for dynamic pages and one for static. These are currently on the same machine, but the plan is to run them on separate machines when I’m done.
  2. For editing etc. I just use the dynamic config and everything works just as it has for years.
  3. When I’m done editing, “wget -r -l inf -p -nc -D atyp.us” gets me a static version of the site.
  4. I also have a script to rename some files and deal with a few other site-specific issues.
  5. When I’m all done, I switch over to my static Hiawatha config, which has a couple of URL-rewrite rules to work around the CGI-oriented URLs that WordPress produces.
  6. The live site is running no PHP or MySQL, just Hiawatha serving up static files.

The key point is that this all looks exactly the same as the current site running on a standard setup, even though it’s all very different behind the scenes. When I’m done, I’ll more fully document everything and put up a how-to for other WordPress users to follow.

EDIT: Since someone else is sure to ask, I’ll ask it myself. Why not just switch to a built-for-static system like Octopress? Here’s why.

  • I’d still have to convert the existing content. As long as I automate that process, it doesn’t matter much whether I do it once or many times. Even rebuilding the entire static site from scratch, which I’ve done a lot more while debugging than I’d ever do in normal operation, doesn’t take long enough to bother me. Selective rebuilds would be easy, and even faster.

  • I like the WordPress tools that I use to create, categorize, and present content. When it comes to plugins and themes, even the most popular/sophisticated static systems seem downright primitive by comparison, so I’d be back to doing more stuff by hand.
  • I’m very conservative about breaking links, and none of the static systems are fully compatible with the URL structure that I’ve been using for years.

My only gripes with WordPress are security and performance. Sure, I could make a more drastic change, and the pages would be a bit simpler/smaller if I did (even the simplest WordPress themes generate some horrendous HTML), but I’d need a better reason than that.

EDIT 2: Now that the static site is live, I no longer even run PHP/MySQL on the real server. This edit, and the next post, were added by running a copy of the site on my desktop at home and then copying only the output to the cloud.

Server Design in Serbo-Croatian

Ten and a half years ago, I wrote an article on server design. Considering that I probably worked harder on that than on anything I’ve posted since, I’m pleased that it has continued to be one of the most popular articles on the site despite its increasing age. It’s even more gratifying to know that some people are including it in their academic course materials – maybe half a dozen instances that I know of. Now, thanks to Anja Skrba, it has been translated into Serbo-Croatian. I’m totally unqualified to comment on the accuracy of the translation, but it’s a commendable contribution to the community nonetheless. Feedback on either the content or the translation would surely be appreciated by both of us. Thank you, Anja!

How (Not) To Collaborate

Collaboration is one of the most essential human skills, not just in work but in life generally, and yet it’s poorly taught (if at all) and a lot of people are bad at it. Programmers are especially bad at it, for a whole variety of reasons, and this last week has been like a crash course in just how bad. Collaboration means exchanging ideas. Here’s how I have seen people fail to participate in such exchanges recently.

  • Passive ignoring. No response at all.
  • Active ignoring. Nod, smile, put it on a list to die. This is what a lot of people do when they’ve been told they need to work on their collaboration skills, and want to create an appearance of collaboration without actually working at it.
  • Rejection. All variants of “no” and “what a terrible idea” and “my idea’s better” fall into this category.
  • “It’s my idea now.” The obvious version is just presenting the idea unchanged, as one’s own. The sneakier alternative is to tweak it a little, or re-implement it, so it’s not obvious it’s the same, but still present derivative work without credit to the original.
  • “It’s your problem now.” This is probably the most insidious of all. It presents an appearance of accession, but in fact no exchange of ideas has occurred. Just as importantly, the person doing this has presumed unilateral authority to decide whose problem it is, creating an unequal relationship.

The key to real collaboration is not only to accept a single idea itself, but to facilitate further exchange. Here are some ways to make that work.

  • Accept the context. Respect the priority that the other person gives to the idea along with the idea itself. Assume some responsibility for facilitating it. Don’t force people to remind, re-submit or nag before you’ll really consider what they’re suggesting. Both active and passive ignoring are wrong because they violate this principle.
  • Don’t attach strings. Don’t make people jump through unnecessary hoops, or demand that they assume responsibility for more than the subject of their idea, just to have their idea considered. Obviously, “your problem now” and its cousin “you touch it you own it” violate this rule. I’ve left more jobs because of this tendency, which leaves people shackled to responsibilities they never asked for, than for any other reason. I don’t think I’m the only one.
  • Be a teacher, not a judge. Every opportunity for rejection is also an opportunity for teaching. If there’s something truly wrong with an idea, you should be able to explain the problem in such a way that everyone benefits. You owe it to your team or your community or even your friends and family to develop this skill.
  • Give credit. It will come back to you. People rarely give freely to notorious thieves and hoarders.

Note that I’m not making any appeals to morality here. I’m not saying it’s right to make collaboration easier. I’m saying it’s practical. When you make collaboration with you easy and pleasant, people want to do it more. That frees you to work on the problems that most interest you, and share credit for a successful project instead of getting no credit at all for a failed or stagnant one. When people try to do you a favor, try to accept graciously.

Is Eventual Consistency Useful?

Every once in a while, somebody comes up with the “new” idea that eventually consistent systems (or AP in CAP terminology) are useless. Of course, it’s not really new at all; the SQL RDBMS neanderthals have been making this claim-without-proof ever since NoSQL databases brought other models back into the spotlight. In the usual formulation, banks must have immediate consistency and would never rely on resolving conflicts after the fact . . . except that they do and have for centuries.

Most recently but least notably, this same line of non-reasoning has been regurgitated by Emin Gün Sirer in The NoSQL Partition Tolerance Myth and You Might Be A Data Radical. I’m not sure you can be a radical by repeating a decades-old meme, but in amongst the anti-NoSQL trolling there’s just enough of a nugget of truth for me to use as a launchpad for some related thoughts.

The first thought has to do with the idea of “partition oblivious” systems. EGS defines “partition tolerance” as “a system’s overall ability to live up to its specification in the presence of network partitions” but then assumes one strongly-consistent specification for the remainder. That’s a bit of assuming the conclusion there; if you assume strong consistency is an absolute requirement, then of course you reach the conclusion that weakly consistent systems are all failures. However, what he euphemistically refers to as “graceful degradation” (really refusing writes in the presence of a true partition) is anything but graceful to many people. In a comment on Alex Popescu’s thread about this, I used the example of sensor networks, but there are other examples as well. Sometimes consistency is preferable and sometimes availability is. That’s the whole essence of what Brewer was getting at all those years ago.

Truly partition-oblivious systems do exist, as a subset of what EGS refers to that way. I think it’s a reasonable description of any system that not only allows inconsistency but has a weak method of resolving conflicts. “Last writer wins” or “latest timestamp” both fall into this category. However, even those have been useful to many people over the years. From early distributed filesystems to very current file-synchronization services like Dropbox, “last writer wins” has proven quite adequate for many people’s needs. Beyond that there is a whole family of systems that are not so much oblivious to partitions as respond differently to them. Any system that uses vector clocks or version vectors, for example, is far from oblivious. The partition was very much recognized, and very conscious decisions were made to deal with it. In some systems – Coda, Lotus Notes, Couchbase – this even includes user-specified conflict resolution that can accommodate practically any non-immediate consistency need. Most truly partition-oblivious systems – the ones that don’t even attempt conflict resolution but instead just return possibly inconsistent data from whichever copy is closest – never get beyond a single developer’s sandbox, so they’re a bit of a strawman.

Speaking of developers’ sandboxes, I think distributed version control is an excellent example of where eventual consistency does indeed provide great value to users. From RCS and SCCS through CVS and Subversion, version control was a very transactional, synchronous process – lock something by checking it out, work on it, release the lock by checking in. Like every developer I had to deal with transaction failures by manually breaking these locks many times. As teams scaled up in terms of both number of developers and distribution across timezones/schedules, this “can’t make changes unless you can ensure consistency” model broke down badly. Along came a whole generation of distributed systems – git, hg, bzr, and many others – to address the need. These systems are, at their core, eventually consistent databases. They allow developers to make changes independently, and have robust (though admittedly domain-specific) conflict resolution mechanisms. In fact, they solve the divergence problem so well that they treat partitions as a normal case rather than an exception. Clearly, EGS’s characterization of such behavior as “lobotomized” (technically incorrect even in a medical sense BTW since the operation he’s clearly referring to is actually a corpus callosotomy) is off base since a lot of people at least as smart as he is derive significant value from it.

That example probably only resonates with programmers, though. Let’s find some others. How about the process of scientific knowledge exchange via journals and conferences? Researchers generate new data and results independently, then “commit” them to a common store. There’s even a conflict-resolution procedure, domain-specific just like the DVCS example but nonetheless demonstrably useful. This is definitely better than requiring that all people working on the same problem or dataset remain in constant communication or “degrade gracefully” by stopping work. That has never worked, and could never work, to facilitate scientific progress. An even more prosaic example might be the way police share information about a fleeing suspect’s location, or military units share similar information about targets and threats. Would you rather have possibly inconsistent/outdated information, or no information at all? Once you start thinking about how the real world works, eventual consistency pops up everywhere. It’s not some inferior cousin of strong consistency, some easy way out chosen only by lazy developers. It’s the way many important things work, and must work if they’re to work at all. It’s really strong/immediate consistency that’s an anomaly, existing only in a world where problems can be constrained to fit simplistic solutions. The lazy developers just throw locks around things, over-serialize, over-synchronize, and throw their hands in the air when there’s a partition.

Is non-eventual consistency useful? That might well be the more interesting question.

Changes Coming Soon

I’m going to be making a few changes around the site soon, so I figured I’d give people a bit of warning in case something goes wrong.

  • I’ll be moving. It’s not that I have any problem whatsoever with Rackspace. I must emphasize that they’ve been purely awesome while I’ve been here. The issue is purely one of location. My goal is to reduce my total work-to-home latency, and they’re not in the right place for that. By moving, I can make that latency half of what it is currently, and a third of what it was back when I did things the “recommended” way. I’ll write more some day about the configuration I’ve been using for the last few weeks. For now, it should suffice to say that everything I’ve done here I can do exactly the same way at the new place.
  • I’ll be changing how both blogs (pl.atyp.us and hekafs.org) get served. Between the two of them, this site overall has gone from barely detectable to a small blip. I have no pretensions of this being a truly big or important site, but even a blip needs to consider performance. I have a plan to continue using WordPress as a content-generation system, but I’ll actually be the only one using it directly. What everyone else will see is the result of a script that slurps all of the articles and category/month lists out of WordPress and converts them into (slightly optimized) static files that nginx can serve up by itself with maximum HTTP caching goodness. There will be no need for MySQL or PHP except when I’m adding new posts. There won’t be any need for varnish either, which I consider a good thing since it just screwed me last night by croaking for no reason whatsoever and leaving both blogs dead in the water.
  • As part of the static-page strategy, comments will have to change. Instead of using WordPress comments, I’ll switch to using an external system – probably Livefyre. Old comments will still be visible as part of the static pages, but new comments – including those on old posts – will go through a new system.
  • I’ll probably change the theme too, this time to something as minimal as I can find. No widgets. No sidebars. Just a modest header, a small menu bar, and the articles themselves.

If everything goes well, these changes will have only minimal effect on readers. Comments will look a little different, and load times will generally be faster, but it will still be the same guy writing about the same things in the same style. Stay tuned.

Playing with Scratch

Amy has been interested in creating her own video game for a while now, so I started looking for easy ways to get an eight-year-old started with programming. It quickly became clear that Scratch was the best choice, not only because of the environment itself but also because of the community that has built up around it, so I downloaded the Mac version and we started playing some of the demos. There’s one called Fruit RPG that seemed pretty close to the kind of game we wanted to create. You get up, walk around collecting fruit, then go into the “fruit place” where someone rewards you with a fruit platter – simple stuff, mostly there as an example of lists in Scratch. After Amy went to bed, I started digging in a bit so I could be several steps ahead of her and ready to answer questions when she asked instead of having to figure them out on the spot. My idea was to learn by adding a couple of new features – a poisonous mushroom that would kill you when you picked it up, and a dinosaur that would follow you around trying to steal your fruit (shades of Adventure there). I was able to implement both fairly quickly, but also discovered a couple of bugs.

  • The fruit were infinitely regenerating. Every time you went into a building and then came back out, all of the fruit were there even if you had already collected some.
  • If you walked to the spot on the screen where the fruit-place guy would be if you were in the store, the ending sequence would start even if you were outside.

I’m sure the people who wrote this little example didn’t care about the first, and probably the second either, since this was just an example of using lists anyway. Nonetheless, it seemed likely that I’d learn by fixing them. It turned out to be a bit more of an exercise than I’d thought, because the “scripts” to make things appear and disappear as you change location are intricately tied to the language’s message-passing paradigm. You leave your house by moving over an invisible door object, which as part of its operation broadcasts a message. The background object responds to this message by drawing the outdoor scenery, the fruit objects respond by making themselves visible, and so on.

This got me thinking about some aspects of the message-passing system. For example, does a “broadcast and wait” action mean that the message gets enqueued everywhere before it’s processed anywhere, or would various overlaps be possible? This was important because it would affect whether a second message sent from one of the response scripts was guaranteed to come after the original message, and the viability of some possible solutions hinged on the answer. I wondered how many young programmers had been frustrated by the weird semantics of the “forever when X” action (not “while X” but “if X” within an infinite loop), by having the wrong expectation of whether the “green flag” message is processed synchronously or asynchronously, by not realizing that multiple instances of the same script could run concurrently, and so on.

Needless to say, I fixed the bugs and added my features pretty easily. Along the way, I came up with a general set of “best practices” (essentially actor model) that avoid the pitfalls mentioned above. That’s all nice, but it’s not the real point here. The real point is that careful specification of a system’s behavior isn’t just something that you do for the sake of advanced programmers. Even a simple system like Scratch can involve significant complexity and ambiguity. Even the most inexperienced programmers trying to do the simplest things might get tripped up by details that you didn’t bother to explain. Scratch is a nice little system, but it might be nicer if it didn’t leave my daughter and other children like her trying to debug race conditions. Think about that next time you’re tempted to say that an implementation choice was obvious because users would be able to figure it out.

P.S. I wrote most of this a couple of days ago and wasn’t even sure I’d ever get around to posting it (I have dozens of drafts like that), but it seems particularly timely in light of this book review that was posted today so I might as well let it fly.

Be a Better Raindrop

no single raindrop believes it is to blame for the flood

The computing industry is already awash in condescension and negativity, and it’s getting worse. Yes, I know it’s not a new phenomenon, but I’ve been around long enough to be sure of the trend. I’ve been blogging for over a decade, I was on Usenet for even longer before that, and I was on other forums even before that. I know all about operating-system wars, language wars, editor wars, license wars, and their ilk. I’ve fought many of those wars myself. Still, things seem to be getting worse. Practically no technical news nowadays comes unaccompanied by a chorus of hatred from those who prefer alternatives. Half the time I find out about something that’s really pretty cool only because I see the bitching about it. How sad is that?

The thing is, it really doesn’t matter why people act this way. Yes, some people are just basically spiteful or insecure. Others might think they’re acting from more noble motives, such as bursting a hype bubble or squashing an idea they believe is truly dangerous. Half of the articles on this site are based in such motivations, so I’m by no means claiming innocence. The problem is that even the best-motivated snark still contributes to the generally unpleasant atmosphere. Contrary to popular belief, we techies are social animals. We have our own equivalent of the Overton Window. Every Linus eruption or similar event from a perceived leader shifts that window toward a higher spleen-to-brain ratio. Others emulate that example, and the phenomenon reinforces itself. Those of us who are older, who are leaders, who find ourselves quoted often, owe it to the community not to keep shifting that window in the wrong direction. That’s not being “honest” or “blunt” or “clear” either, if your honesty/bluntness/clarity is only apparent when your comments are negative. Real life is not one-sided. If your commentary is, then you’re not being any of those things. You’re just being part of the problem.

[Image: Linus not helping]

No one of us caused this and no one of us can fix it. However, we can each try to do better. That’s my New Year’s resolution: to start taking the high road and giving people the benefit of the doubt just a bit more often. Sure, some people might get besotted with a particular idea or technology that I think is inferior, but that doesn’t make them stupid or bad. Some people might get carried away with their praise for a company or its products/people, but that doesn’t make them fanbois or shills. Some people are all of those things, and I’m sure I’ll still let slip the dogs of war from time to time when the occasion warrants it, but I’ll at least try to adopt a doctrine of no first strikes and proportional response instead of the ever escalating verbal violence that is now commonplace. Would anyone else like to give it a try?

Limiting Bash Script Run Time

Another self-explanatory bash hack. This one was developed to limit the total run time of a test script, where one of the commands was hanging but I was trying to chase down a different bug.

#!/bin/bash
 
# Run code with a time limit.  This is trickier than you'd think, because the alarm
# signal (or any other) won't be delivered to the parent until any foreground task
# completes.  That kind of defeats the purpose here, since a hung task will also
# block the signal we're using to un-hang it.  Fortunately, a directed "wait" gives
# us a way to work around this issue.  We start both the alarm task and the task
# that does real work in the background, then either way we get into exit_handler
# and kill whichever one's still running.  It's a little bit inconvenient that
# everything has to be wrapped in a "main" function to work, but there's a lot about
# bash that's unfortunate.
 
TIME_LIMIT=5
 
function exit_handler {
	if [ -n "$ALREADY_EXITING" ]; then
		return
	fi
	ALREADY_EXITING=1	# set the guard so we don't re-enter the handler
	if [ -n "$WATCHER" ]; then
		echo "killing watcher"
		kill -s SIGKILL $WATCHER
	fi
	if [ -n "$WORKER" ]; then
		echo "killing worker"
		kill -s SIGKILL $WORKER
	fi
	echo "time to die"
}
trap exit_handler EXIT
 
function alrm_handler {
	echo "alarm went off"
	unset WATCHER
	exit
}
trap alrm_handler ALRM
 
export PARENT_PID=$$
(sleep $TIME_LIMIT; echo "ring ring"; kill -s SIGALRM $PARENT_PID) &
export WATCHER=$!
 
# Example function to demonstrate different completion sequences.
function main {
	if [ "$1" != 0 ]; then
		echo "sleeping"
		sleep $1
		echo "waking up"
	fi
}
 
main "$@" &
export WORKER=$!
wait $WORKER
unset WORKER
 
# Test with shorter sleep times to see the worker finish normally, with longer
# sleep times to see the watcher cut things short.
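
To see both outcomes, run it with a short and then a long sleep (the script name here is just whatever you saved it as):

./timelimit.sh 2     # worker finishes first; exit_handler only has the watcher to kill
./timelimit.sh 30    # watcher fires after 5 seconds and the worker gets killed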