While I was reviewing a patch yesterday, I found code that used lots of distinct directory names for a series of tests – test1 would use brick1 and brick2, test2 would use brick3 and brick4, etc. I’ve run into this pattern myself, and it can be a bit of a maintenance problem as tests are added or removed. For example, in the test scripts for iwhd, there were multiple occasions when adding a test led to accidental reuse of names, and much non-hilarity ensued (everything about that project was non-hilarious but that’s a story for another time). The simplest pattern to deal with this is something like the following, which I suggested in a review comment:

sequence=0
...
# Test 1
sequence=$((sequence+1))
srcdir=/foo/bar$sequence
sequence=$((sequence+1))
dstdir=/foo/bar$sequence
...
# Test 2
sequence=$((sequence+1))
thedir=/foo/bar$sequence

This works pretty well, but the inline manipulation of $sequence kind of bugged me so I tried to put it in a function. My first try looked something like this.

sequence=0
 
function next_value {
    sequence=$((sequence+1))
    echo $sequence
}
 
thedir=/foo/bar/$(next_value)
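To see what that does in practice, here is a minimal sketch that calls the function twice. The variable names first and second are made up for illustration and are not from the patch under review.

```bash
sequence=0

function next_value {
    sequence=$((sequence+1))
    echo $sequence
}

# Two calls that ought to yield two different directories.
first=/foo/bar/$(next_value)
second=/foo/bar/$(next_value)

echo "$first $second"       # prints: /foo/bar/1 /foo/bar/1
echo "sequence=$sequence"   # prints: sequence=0
```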

Yeah, I hear the laughter. For those who didn’t get the joke yet, this falls prey to bash’s handling of variable scope and subshells. The $(next_value) construct is executed in a subshell, so changes it makes to variables aren’t reflected in the parent and you end up with the same value every time. I really should have stopped there, satisfying myself with the original inline version. Sure, that version can still hit the scope/subshell issue, but only if your own code happens to run it inside a subshell (a command substitution, a pipeline, a parenthesized group), not as a side effect of the idiom itself. I realized that getting around the scope/subshell issue would involve something ugly and inefficient, which is why I should have stopped, but I was intrigued. Surely, I thought, there should be a way to do this in an encapsulated and yet robust way. The first idea was to stick the persistent context in a temporary file.

tmpfile=$(my_secure_tmpfile_generator)
 
function next_value {
    # Quoted expansions, and a result variable that doesn't share the function's name.
    prev_value=$(cat "$tmpfile")
    new_value=$((prev_value+1))
    echo "$new_value" > "$tmpfile"
    echo "$new_value"
}
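For anyone who wants to try it, here is a runnable sketch of the same idea. It assumes plain mktemp as a stand-in for the my_secure_tmpfile_generator placeholder, seeds the file with 0 so the first read is well defined, and uses dir1 and dir2 purely as illustrative names.

```bash
# Assumption: plain mktemp stands in for my_secure_tmpfile_generator.
tmpfile=$(mktemp)
echo 0 > "$tmpfile"

function next_value {
    local prev next
    prev=$(cat "$tmpfile")
    next=$((prev+1))
    # The counter lives in the file, so it survives the subshell.
    echo "$next" > "$tmpfile"
    echo "$next"
}

dir1=/foo/bar$(next_value)
dir2=/foo/bar$(next_value)
echo "$dir1 $dir2"   # prints: /foo/bar1 /foo/bar2

rm -f "$tmpfile"
```

Each $(next_value) still runs in a subshell, but the state it changes is on disk rather than in a shell variable, so the parent sees the update on the next call.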

OK, it’s kind of icky, but it should work. Again, I should have stopped there, but that temporary file bothered me. Surely I could do that without the file I/O, perhaps by spawning a subprocess and talking to that through a pipe. Yes, folks, I had embarked on a quest to find the most insanely complicated way to solve a pretty simple problem. The result is generator.sh and here’s an example of how to use it.

source generator.sh
start_generator int_generator 5 6
...
dir1=/foo/bar$(next_value 5 6)
dir2=/foo/bar$(next_value 5 6)

Doesn’t look too bad, does it? OK, now go ahead and look at how it’s done. I dare you. Here are some of the funnier bits.

# start_generator
ctop=$(mktemp -t -u fifoXXXXXX)
mkfifo $ctop || fubar=1

Yes, really. Not polluting the filesystem with a temporary file was part of the point here, but I ended up dropping not one but two orts instead. (Cool word, and yes, I did use a thesaurus.) To be fair, these are only visible in the filesystem momentarily before they’re opened and then deleted, but still. I tried to find a way to do this with anonymous pipes, but there just didn’t quite seem to be a way to get bash to do that right. Here’s the next fun bit.

# start_generator
$1 < $ptoc >$ctop &
eval "exec $2> $ptoc"
eval "exec $3< $ctop"

The first line invokes the subprocess, with input and output fifos. The two execs are the bash way to create read and write file descriptors for a file. They’re wrapped in evals to satisfy my goal of making things as complicated as possible by allowing the caller to specify both the generator function/program and the file descriptors to use. Eval is very evil, of course, so let’s play Spot The Security Flaw.

start_generator int_generator "do_something_evil;"
# ...causes us to eval...
exec do_something_evil;> $ptoc

I'm not going to fix this, because it's only an "insider" threat. This code already runs with the same privilege as the caller, and can't do anything the caller can't. They could also pass in a totally bogus generator function, and I'm not going to worry about that either because they'd only be shooting themselves. On to the next fun piece.

# next_value
echo more 1>&$1
read -u $2 x

Again, this is kind of standard bash stuff to write and then read from specific file descriptors. Having an example of this is one of the main reasons I didn't just throw away the script. With a little bit of tweaking, the same technique could be used as the basis for a general form of IPC to/from a subprocess, and that might be useful some day.
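As a rough illustration of that request/reply pattern, here is a stripped-down sketch. It is not the real generator.sh: the body of int_generator, the workdir variable, and the fixed descriptors 5 and 6 are all assumptions made for this example, and hard-coding the descriptors means no eval is needed.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; names and layout are assumptions,
# not the contents of the real generator.sh.

# Generator: answer every request line with the next integer.
function int_generator {
    local n=0 request
    while read -r request; do
        n=$((n+1))
        echo "$n"
    done
}

# Two named pipes in a private directory:
# parent-to-child (ptoc) and child-to-parent (ctop).
workdir=$(mktemp -d)
ptoc=$workdir/ptoc
ctop=$workdir/ctop
mkfifo "$ptoc" "$ctop"

# Start the generator, then open our ends in the same order it
# opens its own, so each blocking open finds its partner.
int_generator < "$ptoc" > "$ctop" &
exec 5> "$ptoc"
exec 6< "$ctop"
rm -rf "$workdir"   # both ends are open now, so the names can go

# Write a request to descriptor $1, read the reply from descriptor $2.
function next_value {
    local x
    echo more 1>&"$1"
    read -r -u "$2" x
    echo "$x"
}

dir1=/foo/bar$(next_value 5 6)
dir2=/foo/bar$(next_value 5 6)
echo "$dir1 $dir2"   # prints: /foo/bar1 /foo/bar2

# Closing the request pipe gives the generator EOF, so it exits.
exec 5>&- 6<&-
wait
```

The counter lives in the background process, which is why the subshell created by $(next_value 5 6) no longer matters: the subshell inherits descriptors 5 and 6 and only passes messages through them.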

To reiterate: this is some of the craziest code I've ever written. It's way more complicated than other solutions that better satisfy any likely set of requirements, and the implementation threads its way through some particularly perilous bash minefields. FFS, I might as well have just used mktemp in the first place and skipped all of this. You'd have to be nuts to solve this problem this way, but maybe my documentation of the discoveries I made along the way will help someone solve a similar problem. Or maybe it's just a funny story about bash scripting gone horribly wrong.