Search code examples
cbreakgoto

break vs goto: Which is more appropriate in the following bool function in C?


I wrote the following function which is meant to check if there is any non digit char in a string. The function must stop right after finding any non digit char and exit the loop and return true. Else, it will return false. Here's the code I am using: (ctype.h library is included).

bool isnotdigit(string argv)
{
    bool y = false;
    for (int i = 0, n = strlen(argv); i < n; i++)
    {
        char c = argv[i];
        if (! isdigit(c))
        {
            y = true;
            break;
        }
    }
    return y;
}

It can also be done this way:

bool isnotdigit(string argv)
{
    bool y = false;
    for (int i = 0, n = strlen(argv); i < n; i++)
    {
        char c = argv[i];
        if (! isdigit(c))
        {
            y = true;
            goto next;
        }
    }
next:
    return y;
}

If I am not wrong, both codes work the same way. Right? Then, what is the pros and cons of both? Especially in case of the bool function above.


Solution

  • This is a programming style question, and as such, you are probably not going to get a definitive answer.

    Some say that break is just a goto in disguise, so that one is as bad as the other. Some people say you should never use break.

    But the whole point of break is that it's a guaranteed-to-be-unconfusing goto. It always goes from the inside of a loop, to the outside of a loop. (Well, except when it's part of a switch statement, of course.)

    In this case, the goto buys you little or nothing. So I don't think anyone would say that your second example is much better than your first. (When there are nested loops, things can get more complicated. More on that in a bit.)

    And, finally, yet another way of writing your function is like this:

    bool isnotdigit(string argv)
    {
        for (int i = 0, n = strlen(argv); i < n; i++)
        {
            char c = argv[i];
            if (! isdigit(c))
            {
                return false;
            }
        }
        return true;
    }
    

    But some people say that this is bad style, because (they say) a function should have exactly one return statement. Other people, however, say that this is good style, because it gets rid of the extra boolean variable y, and getting rid of extra variables (especially extra little Boolean variables that just keep track of things) is a good rule, too.

    Here are my opinions:

    • goto can be confusing, and you almost never need it. I almost never use goto.
    • break is fine. It's almost never confusing. I use it all the time. So I would very much prefer your first fragment over your second.
    • Extra boolean variables, that just keep track of little things, can be confusing, especially when the only reason they're there is to get around some other "rule".
    • I have no problem with multiple return statements, so I would always use something more like the third alternative, as I've presented it in this answer.

    See also this older question.

    Now, in fairness, the argument against multiple return statements can be a good one. If you've got cleanup to do before you exit, or if you ever have to add some cleanup code in the future, it can be very easy to forget to add it in both places (or in three or more places) if there are multiple return statements. For me, when the function is small and simple (as it is in this case), I think the cleanliness (and the loss of the extra Boolean variable) outweighs the risks. But if your metastyle is that you don't like judgment calls, if you like rigid rules that you can apply everywhere to avoid the superficial risks no matter what, then the "no multiple return statements" rule makes sense.

    Finally, there's the question of nested loops. The only justification I can imagine for the goto next; usage in your second example is that if a later programmer comes along and adds a nested loop, the code with break would probably not work any more, and would have to be reworked somehow. By using goto, the rationalization might go, the code is more robust against that possibility. (Personally, I don't think that's a good rationalization for the goto, but as I say, it's the only one I can think of.)

    Once you have nested loops, the pros and cons (that is, goto versus break versus multiple return statements) definitely shift around, as discussed in the answers to that other question.

    P.S. Some languages "solve" the break-out-of-nested-loops problem by having something like break(2) that can break out of multiple loops at once. C does not have this, and the reason is that it was felt to be too potentially confusing, and too likely to, um, break if a later maintenance programmer added or removed a level of nesting.

    Or in other words, although single-level-break is a guaranteed-to-be-unconfusing goto, multi-level-break is potentially just as confusing as goto. And, of course, you can argue with my characterizations here: after all, who says that break is guaranteed to be unconfusing? It's not strictly guaranteed, of course, but that's because any language feature can be confusing if you use it badly. (Case in point: extra little boolean variables that just keep track of various things.)