error-handling, stata, stata-macros

Is it possible to make Stata throw an error by default when a global macro is not defined, instead of a missing string?


A feature of Stata that is sometimes inconvenient is that referencing an undefined macro returns the missing value . [edit: Stata actually returns an empty string "", not a numeric missing value] instead of throwing an error. Code whose correct execution depends on the macro being defined may simply run and give incorrect results if the macro name is misspelled.

E.g., having defined global options ", vce(robust)", if one later writes reg y x $opt instead of reg y x $options, the program runs anyway, and it may be difficult to realise that the vce() option was ignored.

Is there any way to force Stata to issue an error in this case or is there some useful trick/best practice that can be used to reduce the risk of incurring this sort of mistake?


Solution

  • The feature is described incorrectly. A macro that is undefined is evaluated as an empty string, conventionally written "", i.e. the delimiters " " contain nothing.

    An undefined macro is never evaluated as numeric system missing, written as a period . (call it dot or stop if you want).

    You would see system missing if the macro were set to contain something else that was system missing, which is entirely different. Saved results from programs, for example, might be system missing.

    One way to understand this is that macros in Stata contain strings, not numeric values; the fact that some macros have a numeric interpretation is something else. So, an undefined macro is evaluated as an empty string.

    Stata programmers learn to use this feature constructively as a way of allowing defaults when macros are undefined and other choices when they are defined.
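
    As a minimal sketch of that constructive use (the local macro name level and the choice of 95 as a default are illustrative assumptions, not part of the question), a do-file can fall back to a default when a macro is empty:

    ```stata
    * Hypothetical do-file fragment: if the local macro named level is
    * empty (undefined, or defined with no content), use 95 as a default.
    if "`level'" == "" {
        local level 95
    }

    sysuse auto, clear
    regress mpg weight, level(`level')
    ```

    The test cannot tell an undefined macro from one deliberately set to nothing, which is exactly why the pattern works as a default mechanism.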

    You are correct that the feature is a source of bugs, as when a spelling mistake leads Stata to see a name that isn't defined and silently ignore the reference. The bug is still the programmer's bug, not Stata's.

    So, what can you do, apart from check your code as usual? You can always check whether a macro is defined, as in

    if "$options" == "" { 
        * do something 
    } 
    else {
        * do something else 
    } 
    

    Conversely,

    if "$options" != "" 
    

    is a test for content.
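
    To turn the first test into a hard stop, which is what the question asks for, one possible sketch for a do-file or program follows. The error message and the return code 198 are arbitrary choices made for illustration; y and x are the variables from the question.

    ```stata
    * Stop with an error if the global macro named options is empty.
    * Compound double quotes guard against quotes inside the macro.
    if `"$options"' == "" {
        display as error "global macro options is not defined"
        exit 198
    }

    regress y x $options
    ```

    This only protects the name that is actually tested: a guard on options does nothing for a later misspelled reference such as $opt, so the check and the use must agree on the name.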

    Alternatively, you could use string scalars. Here is an experiment:

    . sysuse auto, clear
    (1978 Automobile Data)
    
    . scalar foo = ", meanonly"
    
    . summarize mpg `=scalar(foo)'
    
    . ret li
    
    scalars:
                      r(N) =  74
                  r(sum_w) =  74
                    r(sum) =  1576
                   r(mean) =  21.2972972972973
                    r(min) =  12
                    r(max) =  41
    
    . summarize mpg `=scalar(bar)'
    bar not found
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
             mpg |         74     21.2973    5.785503         12         41
    

    In this case, there was an error message when an undefined scalar was referred to, but the command was executed anyway.
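
    If the command should not run at all, one option (a sketch, not something tried in the experiment above) is to check for the scalar first with confirm scalar, which issues an error when no scalar of that name exists:

    ```stata
    sysuse auto, clear
    scalar foo = ", meanonly"

    * confirm scalar fails with an error if the named scalar does not
    * exist, so a do-file stops here before summarize is reached.
    confirm scalar foo
    summarize mpg `=scalar(foo)'
    ```

    Replacing foo with an undefined name such as bar in the confirm line should halt a do-file at that line rather than letting summarize run without the option.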

    Personally, as a long-term (1991- ) and high-intensity Stata user, I just use macros routinely and regard being occasionally bitten by bugs of this kind as a very small price to pay for that. I had never used string scalars in this way before trying to answer this question.

    It's a different argument, but I regard using global macros in this way as poor programming style. There are general arguments across programming for minimizing the use of globally declared entities. Local macros are the beasts of choice.
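
    A minimal sketch of the local-macro version of the question's example, using the auto dataset so that it is self-contained (the variables mpg and weight stand in for y and x):

    ```stata
    sysuse auto, clear

    * A local macro is visible only within the do-file or program
    * that defines it, unlike a global macro.
    local options ", vce(robust)"
    regress mpg weight `options'
    ```

    A misspelled local is still evaluated as an empty string, so this limits the scope of the problem rather than removing it.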