How to make postfile work with both string and numeric variables


I am trying to write a Stata program that does some calculation by an identifier, and I want it to accept an identifier that is either a string or an integer.

A grossly simplified version of what I am trying to do looks like this:

clear all

***** test data
input str10 id1 id2 x y
a   1   20  40
a   1   140 20
a   1   0   70
b   2   50  25
b   2   25  50
b   2   40  42
end

*****
capture program drop myprog
program define myprog
    version 14.2
    syntax using, ID(varname) Mean(varname)
    tempname postname

    quietly levelsof `id', local(ids)
    local idtype: type `id'

    postfile `postname' `idtype' `id' `mean' `using', replace


    foreach i of local ids {
        quietly summarize `mean' if `id'==`i'
        post `postname' (`i') (`r(mean)')
    }

    postclose `postname'
end

And I expect both of the following to work:

myprog using "test1.dta", id(id1) mean(x)
myprog using "test2.dta", id(id2) mean(x)

Any advice?


Solution

  • Use an if / else statement to distinguish between the two cases, quoting each level only when the identifier is a string:

    capture program drop myprog
    program define myprog
        version 14.2
        syntax using, ID(varname) Mean(varname)
        tempname postname

        * Collect the distinct levels and the storage type of the identifier
        quietly levelsof `id', local(ids)
        local idtype: type `id'

        * Declare the identifier with its own storage type in the results file
        postfile `postname' `idtype' `id' `mean' `using', replace

        if substr("`idtype'", 1, 3) == "str" {
            * String identifier: quote the level in the comparison and in post
            foreach i of local ids {
                summarize `mean' if `id' == "`i'", meanonly
                post `postname' ("`i'") (`r(mean)')
            }
        }
        else {
            * Numeric identifier: use the level as is
            foreach i of local ids {
                summarize `mean' if `id' == `i', meanonly
                post `postname' (`i') (`r(mean)')
            }
        }

        postclose `postname'
    end
    

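    Incidentally, note the use of the meanonly option of summarize: it suppresses the output table and skips the variance calculation, which is enough when only r(mean) is needed.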
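
To see what meanonly changes, here is a minimal sketch that assumes nothing beyond the test data from the question. It runs summarize on one level of id1 with and without the option and then lists the returned results; the display labels are only illustrative.

```stata
clear
* Test data copied from the question
input str10 id1 id2 x y
a   1   20  40
a   1   140 20
a   1   0   70
b   2   50  25
b   2   25  50
b   2   40  42
end

* Default call: prints the summary table and also returns r(sd) and r(Var)
summarize x if id1 == "a"
display "sd after default call: " r(sd)

* meanonly: no table and no variance calculation; r(mean) is still returned
summarize x if id1 == "a", meanonly
display "mean after meanonly: " r(mean)
return list
```

After the second call, return list should no longer show r(sd) or r(Var), while r(mean), r(N), r(min), r(max) and r(sum) remain available. That is why the option suits a loop that only posts the mean.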