I am trying to write a Stata program that performs a calculation by an identifier, and I want it to work whether the identifier is a string or an integer variable.
A grossly simplified version of what I am trying to do is like this:
clear all
***** test data
input str10 id1 id2 x y
a 1 20 40
a 1 140 20
a 1 0 70
b 2 50 25
b 2 25 50
b 2 40 42
end
*****
capture program drop myprog
program define myprog
    version 14.2
    syntax using, ID(varname) Mean(varname)
    tempname postname
    quietly levelsof `id', local(ids)
    local idtype: type `id'
    postfile `postname' `idtype' `id' `mean' `using', replace
    foreach i of local ids {
        quietly summarize `mean' if `id'==`i'
        post `postname' (`i') (`r(mean)')
    }
    postclose `postname'
end
And I expect both of the following to work:
myprog using "test1.dta", id(id1) mean(x)
myprog using "test2.dta", id(id2) mean(x)
Any advice?
Just use an if / else statement to distinguish between the two cases:
capture program drop myprog
program define myprog
    version 14.2
    syntax using, ID(varname) Mean(varname)
    tempname postname
    quietly levelsof `id', local(ids)
    local idtype: type `id'
    postfile `postname' `idtype' `id' `mean' `using', replace
    if substr("`idtype'" , 1, 3) == "str" {
        foreach i of local ids {
            summarize `mean' if `id'=="`i'", meanonly
            post `postname' ("`i'") (`r(mean)')
        }
    }
    else {
        foreach i of local ids {
            summarize `mean' if `id'==`i', meanonly
            post `postname' (`i') (`r(mean)')
        }
    }
    postclose `postname'
end
Incidentally, note the use of the meanonly option of summarize: it suppresses the output and skips the variance calculation, so quietly is not needed.
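To see what the branch is keying on, and to check that both calls work, something like the following could be run. This is only a sketch: it assumes the test data from the question are in memory and that myprog has been defined as above. The local names t1 and t2 are made up for illustration.

```stata
* Assumes the question's test data are in memory and myprog is defined as above.

* The extended macro function "type" returns the storage type as text.
local t1 : type id1
local t2 : type id2
display "id1 is stored as `t1'"    // a string type, so it starts with "str"
display "id2 is stored as `t2'"    // a numeric type

* Run the program once per identifier, then inspect the posted results.
preserve
myprog using "test1.dta", id(id1) mean(x)
myprog using "test2.dta", id(id2) mean(x)

use "test1.dta", clear
describe
list

use "test2.dta", clear
describe
list
restore
```

Each results file should contain one observation per level of the identifier, with the identifier keeping its original storage type. If you would rather not parse the type string, capture confirm string variable `id' followed by a check of _rc is another way to detect the string case.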