How to check that a function always return a value (aka "doesn't fall off the end")?

I'm building a didactic compiler, and I'd like to check if the function will always return a value. I intend to do this in the semantic analysis step (as this is not covered by the language grammar).

Out of all the flow control statements, this didactic language only has if, else, and while statements (so no do while, for, switch cases, etc). Note that else if is also possible. The following are all valid example snippets:

if (condition) {
    // non-returning commands
}
return value

if (condition) {
    return value
}
return anotherValue

if (condition) {
    return value1
} else {
    return value2
}
// No return value needed here

I've searched a lot about this but couldn't find a pseudoalgorithm that I could comprehend. I've searched for software path testing, the white box testing, and also other related Stack Overflow questions like this and this.

I've heard that this can be solved using graphs, and also using a stack, but I have no idea how to implement those strategies.

Any help with pseudocode would be very helpful!

(and if it matters, I'm implementing my compiler in Swift)

Solution

If you have a control flow graph, checking that a function always returns is as easy as checking that the implicit return at the end of the function is unreachable. So since there are plenty of analyses and optimizations where you'll want a CFG, it would not be a bad idea to construct one.

That said, even without a control flow graph, this property is pretty straight forward to check assuming some common restrictions (specifically that you're okay with something like if(cond) return x; if(!cond) return y; being seen as falling of the end even though it's equivalent to if(cond) return x; else return y;, which would be allowed). I also assume there's no goto because you didn't list it in your list of control flow statements (I make no assumptions about break and continue because those only appear within loops and loops don't matter).

We just need to consider the cases of what a legal block (i.e. one that always reaches a return) would look like:

So an empty block would clearly not be allowed because it can't reach a return if it's empty. A block that directly (i.e. not inside an if or loop) contains a return would be allowed (and if it isn't at the end of the block, everything after the return in the block would be unreachable, which you might also want to turn into an error or warning).

Loops don't matter. That is, if your block contains a loop, it still has to have a return outside of the loop even if the loop contains a return because the loop condition may be false, so there's no need for us to even check what's inside the loop. This wouldn't be true for do-while loops, but you don't have those.

If the block directly contains an if with an else and both the then-block and the else-block always reach a return, this block also always reaches a return. In that case, everything after the if-else is unreachable. Otherwise the if doesn't matter just like loops.

So in pseudo code that would be:

alwaysReturns( {} ) = false
alwaysReturns( {return exp; ...rest} ) = true
alwaysReturns( { if(exp) thenBlock else elseBlock; ...rest}) =
    (alwaysReturns(thenBlock) && alwaysReturns(elseBlock)) || alwaysReturns(rest)
alwaysReturns( {otherStatement; ...rest} ) = alwaysReturns(rest)