Why do some kernel programmers use goto instead of simple while loops?
Historical context:
We should remember that Dijkstra wrote "Go To Statement Considered Harmful" in 1968, when a lot of programmers used goto in place of structured programming constructs (if, while, for, etc.).
It's 44 years later, and it's rare to find this use of goto in the wild. Structured programming has already won, long ago.
Case analysis:
The example code looks like this:
    SETUP...
again:
    COMPUTE SOME VALUES...
    if (cmpxchg64(ptr, old_val, val) != old_val)
        goto again;
The structured version looks like this:
    SETUP...
    do {
        COMPUTE SOME VALUES...
    } while (cmpxchg64(ptr, old_val, val) != old_val);
When I look at the structured version, I immediately think, "it's a loop". When I look at the goto version, I think of it as a straight line with a "try again" case at the end.
The goto version has both SETUP and COMPUTE SOME VALUES on the same column, which emphasizes that most of the time, control flow passes through both. The structured version puts SETUP and COMPUTE SOME VALUES on different columns, which emphasizes that control may pass through them differently.
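To make the two shapes concrete, here is a minimal user-space sketch. It is not the kernel code: it assumes C11 atomics as a stand-in for cmpxchg64 (note that atomic_compare_exchange_strong returns a success flag rather than the old value), and the function names and the "double the value" computation are made up for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical example: atomically double a counter, goto style.
 * The retry path is a rare exception tacked onto straight-line code. */
static uint64_t double_goto(_Atomic uint64_t *ptr)
{
    uint64_t old_val, val;

again:
    old_val = atomic_load(ptr);   /* COMPUTE SOME VALUES... */
    val = old_val * 2;
    if (!atomic_compare_exchange_strong(ptr, &old_val, val))
        goto again;               /* someone else changed *ptr; retry */
    return val;
}

/* Same behavior, structured style: the body reads as "a loop". */
static uint64_t double_loop(_Atomic uint64_t *ptr)
{
    uint64_t old_val, val;

    do {
        old_val = atomic_load(ptr);
        val = old_val * 2;
    } while (!atomic_compare_exchange_strong(ptr, &old_val, val));
    return val;
}
```

Both functions do the same thing; the only difference is which reading the layout nudges you toward.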
The question here is what kind of emphasis you want to put in the code. You can compare this with goto for error handling:
Structured version:
    if (do_something() != ERR) {
        if (do_something2() != ERR) {
            if (do_something3() != ERR) {
                if (do_something4() != ERR) {
                    ...
Goto version:
    if (do_something() == ERR)  // Straight line
        goto error;             // |
    if (do_something2() == ERR) // |
        goto error;             // |
    if (do_something3() == ERR) // |
        goto error;             // V
    if (do_something4() == ERR) // emphasizes normal control flow
        goto error;
The code generated is basically the same, so we can think of it as a typographical concern, like indentation.
Very good question, and I think only the author(s) can provide a definitive answer. I'll add my bit of speculation: it could have started with goto being used for error handling, as explained by @Izkata, and the gates were then open to using it for basic loops as well.
The error handling usage is a legitimate one in systems programming, in my opinion. A function gradually allocates memory as it progresses, and if an error is encountered, it will goto the appropriate label to free the resources in reverse order from that point.
So, if the error occurs after the first allocation, it will jump to the last error label and free only one resource. Likewise, if the error occurs after the last allocation, it will jump to the first error label and run from there, freeing all the resources until the end of the function. This pattern for error handling still needs to be used carefully, especially when modifying code; valgrind and unit tests are highly recommended. But it is arguably more readable and maintainable than the alternative approaches.
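A minimal sketch of that layout might look like the following. Everything here is hypothetical (the struct, the function names, the buffer sizes), and the fail_at parameter exists only so each error path can be exercised; real code would simply check the result of each allocation.

```c
#include <stdlib.h>

/* Hypothetical context holding three resources; names are illustrative. */
struct ctx {
    char *a;
    char *b;
    char *c;
};

/* fail_at (1..3) simulates a failure of that allocation; 0 means none. */
static int ctx_init(struct ctx *ctx, int fail_at)
{
    ctx->a = ctx->b = ctx->c = NULL;

    ctx->a = (fail_at == 1) ? NULL : malloc(16);
    if (!ctx->a)
        goto err;            /* nothing allocated yet */
    ctx->b = (fail_at == 2) ? NULL : malloc(16);
    if (!ctx->b)
        goto err_free_a;     /* last freeing label: frees one resource */
    ctx->c = (fail_at == 3) ? NULL : malloc(16);
    if (!ctx->c)
        goto err_free_b;     /* first label: falls through and frees all */
    return 0;

    /* Labels appear in reverse order of allocation. */
err_free_b:
    free(ctx->b);
err_free_a:
    free(ctx->a);
err:
    ctx->a = ctx->b = ctx->c = NULL;
    return -1;
}

/* Normal teardown, also in reverse order of allocation. */
static void ctx_destroy(struct ctx *ctx)
{
    free(ctx->c);
    free(ctx->b);
    free(ctx->a);
}
```

Every goto here jumps forward, and each label falls through into the ones after it, so adding a fourth resource means adding one check and one label rather than another level of nesting.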
One golden rule of using goto is to avoid the so-called spaghetti code. Try drawing lines between each goto statement and its respective label. If you have lines crossing, well, you've crossed a line :). Such usage of goto is very hard to read and a common source of hard-to-track bugs, of the kind found in languages like BASIC that relied on it for flow control.
If you do just one simple loop you won't get crossing lines, so it's still readable and maintainable, and it becomes largely a matter of style. That said, since such loops can just as easily be written with the language-provided loop keywords, as you've indicated in your question, my recommendation would still be to avoid using goto for loops, simply because the for, do/while or while constructs are more elegant by design.
In the case of this example, I suspect it was about retrofitting SMP support into code that was originally written in a non-SMP-safe way. Adding a goto again; path is a lot simpler and less invasive than restructuring the function.
I can't say I like this style much, but I also think it's misguided to avoid goto for ideological reasons. One special case of goto usage (different from this example) is where goto is only used to move forward in a function, never backwards. This class of usages never results in loop constructs arising out of goto, and it's almost always the simplest, clearest way to implement the needed behavior (which is usually cleaning up and returning on error).