Most of my work is on code that has “initialize all local variables at declaration time” as part of the coding standard. I’ve never been a big fan, but I’m very reluctant to get into coding-standard arguments (probably as the result of having had to enforce them for so long) so I just let it go. The other day, Rusty Russell offered up a better reason to avoid this particular standard.
The crux of the matter is that there’s a difference between a value being initialized and being initialized correctly, and the difference is too subtle to capture in a usable standard. Sometimes there is a reasonable default value, and you want to initialize to that value instead of setting it in ten different places. Other times every value has a distinct important meaning, and code depends on a variable having one of those instead of a bland default. Does NULL mean “unassigned” or “no such entry” or “allocate for me” or something else?
The worst part of all this is that required initializers prevent compilers and static-analysis tools from finding real uninitialized-variable errors for you. As far as they’re concerned it was initialized; they don’t know that the initial value, if left alone, will cause other parts of your program to blow up. If you need a real value, what you really want to do is leave the variable uninitialized at declaration time, and let compilers etc. do what they’re good at to find any cases where it’s used without being set to a real value first. If your coding standard precludes this, your coding standard is hurting code quality.
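To make that concrete, here is a minimal C sketch. The function name parse_level and its return values are hypothetical, made up for illustration. The local is deliberately left without an initializer, so every path has to give it a real value.

```c
#include <string.h>

/* Hypothetical helper: maps a level name to a number, -1 if unknown. */
int parse_level(const char *s)
{
    int level;                      /* deliberately no initializer */

    if (strcmp(s, "low") == 0)
        level = 1;
    else if (strcmp(s, "high") == 0)
        level = 3;
    else
        level = -1;                 /* every path assigns a real value */

    /* If the final else were deleted, gcc (-Wall, via -Wmaybe-uninitialized,
     * usually with optimization on) or clang (-Wsometimes-uninitialized)
     * can often flag this return. Writing "int level = 0;" at the top would
     * silence them and let a meaningless 0 leak out instead. */
    return level;
}
```

Whether a given compiler actually reports the missing branch depends on its version and flags, so treat the warning as a likely outcome rather than a guarantee.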
Rusty suggests that new languages should be designed with a built-in concept of undefined variables. At the very least, each type should have a value that cannot be set, and that only the interpreter/compiler can check. This last part is important, because otherwise people will use it to mean NULL, with all of the previously-mentioned ambiguity that entails. The “uninitialized” value for each type should mean only that – never “ignored” or “doesn’t matter” or anything else. A slightly better approach is to make “uninitialized” only one of many variable annotations that are possible, as in cqual. Maybe some of that functionality will even be baked into gcc or LLVM (small pieces already are), providing the same functionality in current languages. Until then, the best option is to educate people about why it can sometimes be good to leave variables uninitialized until you have a real value for them.
“Until then, the best option is to educate people about why it can sometimes be good to leave variables uninitialized.”
Nonsense. You’re introducing massive intellectual burden for almost zero return.
Part of the benefit of coding standards is to minimize the amount of careful thought required for simple coding tasks and to create uniformity and a reliable meta-language invariant.
“Educating” people and letting them make case-specific decisions for “better” code is completely wrong.
What part of “sometimes” are you having trouble with? I’m not saying you should *never* initialize locals. In general you should, but it shouldn’t be rigidly required by the coding standard because there are valid exceptions to the rule. Letting broken code compile when the error could have been detected and flagged automatically if not for a bogus standard is counter-productive. It’s far more of an intellectual burden on the person who has to debug it than the occasional relaxation of the standard would be; if you had read what Rusty or I wrote before jerking your knee, you’d get that.
BTW, when you introduce yourself to someone with “nonsense” you come across as a total ass. Don’t try to defend *that* or I’ll just delete your comments.
Many decades ago, someone ranted to me about the virtues of one computer, that had the number negative zero. Positive zero was your normal zero, but negative zero could only be assigned, not computed.
Your post made me think of that. I have no idea what kind of hardware it was; I think this was in the mid-to-late 1980s.
This kind of rule seems like something a sophomore programmer would latch onto — “You should initialize variables!” — without thinking through *why* you should initialize them, or *when* you should initialize them.
The problem isn’t the rule that “all local variables should be initialised”, the problem is treating a coding standard as a rigid set of rules that *must* be followed, instead of a set of guidelines designed to help you avoid common mistakes and to make the code base easier to understand (since it uses a common dialect of the programming language, rather than reflecting the idiosyncratic preferences of each individual developer).
That’s why the second section of Python’s PEP 8 is dedicated to the fact that it’s sometimes appropriate to deviate from the coding standard (http://www.python.org/dev/peps/pep-0008/).
@Ronald: all modern PCs can both compute and assign negative zero – it’s part of the IEEE754 floating point standard.
This also breaks down a bit when it comes to languages that encourage immutable variables. This is true for a lot of functional programming languages, including Scala, OCaml, and Clojure. Such variables act like runtime constants, and it’s somewhere between difficult and non-idiomatic to always declare them before they are actually used.
Dead on. Perl has this concept, and it is as powerful as you indicate. You can create an undefined variable with
my $var;
and then the test for undefined-ness is
if (!defined $var) {
# do something for an undefined state
…
}
This *should be* different than NULL, though a plain truth test such as if ($var) treats undef the same as 0 or the empty string, so the explicit defined() test is the one that actually checks for a defined state. (Only the use of defined() on whole arrays and hashes has been deprecated in newer releases; on scalars it is still the standard test.) One can also return a variable to an undefined state using the undef() function.
I am not trying to sell anyone on perl here. Just pointing out that within the language, this is “reasonably” well implemented (for some values of reasonable). In my opinion, having used this a bit, I look at the undefined state as something different than the value of NULL or 0 or the other things that seem to be aliased together.
Undefined is a useful concept for much of the code we develop. You have to be more careful in C and similar compiled languages. I don’t necessarily agree that “more careful” is better, if “more careful” is due to language constraints/semantics as compared to ease and accuracy in expressing an algorithm.
I think the best advice I ever saw on this (for C++) was simple – don’t declare variables until you have knowledge of what the initial value should be, and then declare and initialize them at that point. I realize this doesn’t work for all languages (C before C99 being one), but where it’s possible, it seems like the best path to me.
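For what it’s worth, C99 and later allow declarations mixed with statements, so the same habit can be sketched in C as well. The helper first_word_len below is hypothetical; the point is only that each local appears at the line where its real value is first known.

```c
#include <stddef.h>

/* Hypothetical helper: length of the first space-delimited word in s. */
size_t first_word_len(const char *s)
{
    while (*s == ' ')               /* skip leading spaces */
        s++;

    const char *start = s;          /* declared once its value is known */

    while (*s != '\0' && *s != ' ')
        s++;

    const size_t len = (size_t)(s - start);  /* likewise, never a placeholder */
    return len;
}
```

Neither local ever holds a dummy value, so there is nothing for a reader (or a tool) to mistake for a meaningful default.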
Ronald,
Nearly every modern machine has a +0 and -0; it’s part of the IEEE 754 floating point standard. Either value can be assigned, or can result from rounding towards 0 from either direction.
Jeff,
I agree both that you should generally initialize at declaration, and that you need to allow exceptions. After all, all standards have exceptions, except for the standards that don’t have exceptions :)
Regarding IEEE Floating point, negative 0 has a real purpose, for representing the result of an underflow from negative values as distinct from an underflow from positive values.
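A small C sketch of that underflow behaviour, assuming a platform with IEEE 754 doubles and the default rounding mode. The function name tiny_product is made up for illustration.

```c
#include <math.h>

/* Hypothetical helper: multiplies two doubles; with tiny inputs the
 * result underflows to zero but keeps the sign of the true product. */
double tiny_product(double a, double b)
{
    return a * b;
}

/* Returns 1 if x is a zero carrying a negative sign, 0 otherwise.
 * Note that -0.0 == 0.0 compares equal, so signbit() is needed to
 * tell the two zeros apart. */
int is_negative_zero(double x)
{
    return x == 0.0 && signbit(x) != 0;
}
```

Under those assumptions, tiny_product(-1e-300, 1e-300) underflows to a negative zero and tiny_product(1e-300, 1e-300) to a positive one, even though both compare equal to 0.0.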
For detecting uninitialized variables, IEEE floating point has the “NaN”s, or “not a number” values. Some bit patterns (the signaling NaNs) raise an invalid-operation exception when used, which can be configured to trap, and others (the quiet NaNs) merely propagate through calculations, to show that the result is trash. NaNs should always be used as the default initialization for floating-point data. They are amazingly useful at turning up bugs in scientific code.
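As a rough illustration in C, assuming the platform supports quiet NaNs (which is when <math.h> defines the NAN macro), a buffer can be pre-filled with NaN so that any slot nobody assigned poisons the result instead of quietly contributing a zero. The names fill_unset and sum are hypothetical.

```c
#include <math.h>
#include <stddef.h>

/* Mark every element as "not yet computed" with a quiet NaN. */
void fill_unset(double *v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        v[i] = NAN;
}

/* Plain sum; a single NaN element makes the whole result NaN. */
double sum(const double *v, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

If a caller fills only two of three slots, isnan(sum(v, 3)) is true, which points at the forgotten assignment; had the buffer been zero-filled, the sum would have looked plausible and been wrong.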