Like just about any geek, I think a lot about what features the ideal programming language would have. Several of my ideas have to do with scalar data types. I know I’ve written in the past about my distaste for mandatory type checking but, even without that, being able to define types that can be distinguished from one another when you want to is very useful. I have long considered it one of C’s biggest warts that every enum is effectively the same type, for example. Having to fix a bug because two function arguments of different enum types but the same underlying language type were transposed is annoying. This is why it’s often a good idea to give each enum an explicit and distinct starting value. I find that when I use an enum I almost always have a switch statement somewhere with cases for every known value plus a default case that will spit out an “unrecognized value” error if it ever executes. If each enum occupies a distinct range, such code will catch this kind of type error for you very quickly.
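As a rough sketch of the distinct-range trick described above (the enum names, the base values 100 and 200, and the `describe` function are all made up for illustration), the default case in each switch is what turns a transposed argument into an immediate, visible error:

```c
#include <stdio.h>

/* Hypothetical enums: each one starts at its own base so the ranges never overlap. */
enum color { COLOR_RED = 100, COLOR_GREEN, COLOR_BLUE };
enum shape { SHAPE_CIRCLE = 200, SHAPE_SQUARE, SHAPE_TRIANGLE };

/* Returns a printable name, or NULL for a value outside the color range. */
static const char *color_name(enum color c)
{
    switch (c) {
    case COLOR_RED:   return "red";
    case COLOR_GREEN: return "green";
    case COLOR_BLUE:  return "blue";
    default:          return NULL;
    }
}

/* Returns a printable name, or NULL for a value outside the shape range. */
static const char *shape_name(enum shape s)
{
    switch (s) {
    case SHAPE_CIRCLE:   return "circle";
    case SHAPE_SQUARE:   return "square";
    case SHAPE_TRIANGLE: return "triangle";
    default:             return NULL;
    }
}

/* Returns 0 on success, -1 if either argument is not a known value of its type. */
static int describe(enum color c, enum shape s)
{
    const char *cn = color_name(c);
    const char *sn = shape_name(s);

    if (cn == NULL || sn == NULL) {
        fprintf(stderr, "unrecognized value: color=%d shape=%d\n", (int)c, (int)s);
        return -1;
    }
    printf("%s %s\n", cn, sn);
    return 0;
}
```

With these definitions, describe(COLOR_RED, SHAPE_SQUARE) returns 0, while the transposed call describe(SHAPE_SQUARE, COLOR_RED), which a C compiler will typically accept with at most a warning, returns -1 the first time it runs. Had both enums started at zero, the transposed call would have quietly printed the wrong thing instead.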

Another area where I’ve always felt C/C++/Java were weak was in the definition of related constants. At least C/C++ have enum to solve the fundamental problem of having the compiler instead of the programmer ensure uniqueness of values. Java, amazingly, managed to take a step back even from that, forcing people to use a bunch of static final variables instead (static final itself being a clumsy way to say “constant”). Nowadays the official “Java Way” is to define a whole class for each enumerated type, which is not only inefficient but also even easier to screw up; now you have to worry about getting the methods right as well as the values.

Enum solves a problem for the most common case where the only constraint is that the values within it must be unique, but there are other possible extensions based on other constraints. I and many others have often wished for a “binary enum” to define flag values, for example: instead of assigning values like 1, 2, 3, 4 it would assign 0x01, 0x02, 0x04, 0x08. The constraint here is that the values not only be unique but have unique bits turned on, making it practical to apply bitwise operations to such a variable (partially replacing the sets whose lack I bemoaned earlier). A further extension of this idea, which you might have been wondering about if you read the title, would be a “tree enum” to represent hierarchical sets of values. For example, look over on the right side of this page and you’ll see a list of article categories. Some of them are nested within one another; for example, if something’s in the lower-level category “design” or “networks” it is implicitly in the higher-level category “tech” as well. If you wanted to enable a simple test for whether something was within a category, what would you do? Let’s take a look at a simpler example based on animal taxonomy:

#define LIZARD    0x01
#define SNAKE     0x02
#define TURTLE    0x04
#define REPTILE   0x07  /* LIZARD | SNAKE | TURTLE */
#define PLATYPUS  0x08
#define ECHIDNA   0x10
#define MONOTREME 0x18  /* PLATYPUS | ECHIDNA */

int myAnimal = SNAKE;
if (myAnimal & REPTILE) {
    // whatever
}
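For what it's worth, the same constants can be written in today's C so that the bit assignments and the group memberships are spelled out rather than precomputed by hand. This is only a sketch; the `enum animal` tag and the `is_a` helper are names invented here, not anything from an existing library:

```c
/* Leaf values each get one bit; writing them as shifts makes that visible. */
enum animal {
    LIZARD   = 1u << 0,
    SNAKE    = 1u << 1,
    TURTLE   = 1u << 2,
    PLATYPUS = 1u << 3,
    ECHIDNA  = 1u << 4,

    /* Group values are the OR of their members, giving 0x07 and 0x18. */
    REPTILE   = LIZARD | SNAKE | TURTLE,
    MONOTREME = PLATYPUS | ECHIDNA
};

/* Hypothetical helper: nonzero if `value` shares any bit with `group`. */
static int is_a(unsigned value, unsigned group)
{
    return (value & group) != 0;
}
```

Here is_a(SNAKE, REPTILE) is true and is_a(SNAKE, MONOTREME) is false. The compiler does the arithmetic for the group masks, but the programmer still has to pick each bit and list each member by hand.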

All well and good, but what if you want to add a tuatara which is another kind of reptile? Now you not only have to add a value (let’s say 0x20) but you also have to recalculate REPTILE (to 0x27). If you want things to be all neat and orderly you’d insert TUATARA after TURTLE and recalculate/renumber everything after that. The problem is that any time you require the programmer to update two or more values and maintain some sort of relationship between them you create a significant potential for them to screw it up. Wouldn’t it be easier if you could just do this instead?

treenum {

The compiler could then generate all of the necessary constants with the necessary relationship between them for you. If you need to add TUATARA you only need to do it in one place. Note also that set operations make sense, though perhaps not in the animal example. Going back to the post-category example, I pretty frequently put a post in multiple categories, so the following all works:

postCat = HUMOR | WORK;
if (postCat & HUMOR) {
    // test succeeds
}
if (postCat & TECH) {
    // test succeeds because WORK is included in TECH
}
if (postCat & INTERNET) {
    // test fails
}
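Until some compiler grows a treenum, the "edit it in one place" property can be roughly approximated in plain C with the preprocessor's X-macro idiom. The sketch below is an approximation under assumptions, not a real treenum: the category lists are hypothetical (in particular, INTERNET is assumed to sit outside TECH), only one level of nesting is shown, and the number of leaf categories is limited by the width of an int.

```c
/* Hypothetical category lists; each name appears exactly once. */
#define TECH_CATEGORIES(X)  X(DESIGN) X(NETWORKS) X(WORK)
#define OTHER_CATEGORIES(X) X(HUMOR) X(INTERNET)

/* Pass 1: number the leaves 0, 1, 2, ... to pick a bit position for each. */
enum category_bit {
#define X(name) name##_BIT,
    TECH_CATEGORIES(X)
    OTHER_CATEGORIES(X)
#undef X
    CATEGORY_BIT_COUNT
};

/* Pass 2: turn each position into a flag, then OR the children into the parent. */
enum category {
#define X(name) name = 1u << name##_BIT,
    TECH_CATEGORIES(X)
    OTHER_CATEGORIES(X)
#undef X
#define X(name) | name
    TECH = 0 TECH_CATEGORIES(X)
#undef X
};

/* Hypothetical helper: nonzero if the set `cats` overlaps category `cat`. */
static int in_category(unsigned cats, unsigned cat)
{
    return (cats & cat) != 0;
}
```

With postCat set to HUMOR | WORK, in_category(postCat, HUMOR) and in_category(postCat, TECH) both succeed and in_category(postCat, INTERNET) fails, matching the three tests above. Adding a new category under TECH means touching only the TECH_CATEGORIES line; the bit values and the TECH mask are regenerated by the compiler. It is a good deal uglier than a real treenum would be, which is rather the point.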