I would like to create a local macro for a subset of my dataset to use for future regressions (see the "Some Uses for Macros Outside of Loops" section).
I've started off with code that is along the following lines:
quietly reg y x1 x2 x3
local subset if e(sample)
list Unit `subset'
reg y x1 x2 if `subset'
x3 has missing values, so some observations are excluded in the first reg command. The output from the list command does indicate that the contents of the macro are indeed what I want (Unit is a variable that identifies the observation).
Nevertheless, I receive an error message after the last command:
if not found
r(111);
From the information on r(111):
__________ not found;
no variables defined;
The variable does not exist. You may have mistyped the variable's name.
What is wrong with my syntax? That is, why is Stata treating if as a variable?
Given your definition, the text if is part of the macro contents.
quietly reg y x1 x2 x3
local subset if e(sample)
list Unit `subset'
reg y x1 x2 if `subset'
So the list command works because it is interpreted as
list Unit if e(sample)
but the regress command is not working because it is interpreted as
regress y x1 x2 if if e(sample)
and Stata is puzzled out of its mind by the second if.
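As a minimal sketch of the fix, either leave the if out of the command (because the macro already supplies it) or leave it out of the macro definition. The example below assumes Stata's bundled auto dataset as a stand-in for your data; price, mpg, weight and rep78 are not from your post, and rep78 plays the role of x3 because it has missing values. Run it as a single do-file so that the local macros remain defined throughout.

```stata
* Stand-in data: Stata's shipped auto dataset (an assumption, not the poster's data)
sysuse auto, clear

* rep78 has missing values, so some observations drop out of this model
quietly regress price mpg weight rep78

* Option 1: the macro holds the whole qualifier, so do not type "if" again
local subset if e(sample)
display "`subset'"                  // shows: if e(sample)
count `subset'                      // expands to: count if e(sample)
regress price mpg weight `subset'   // expands to: regress price mpg weight if e(sample)

* Option 2: the macro holds only the expression, and "if" is typed in the command
local insample e(sample)
regress price mpg weight if `insample'
```

Both versions still depend on e(sample), so they refer to whichever model was estimated most recently.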
That's a comparatively minor deal. The bigger deal is that all you are doing is putting the text if e(sample) into the local macro subset and saving yourself a few characters of typing. That is fragile because, come the next estimation command, with possibly a different estimation sample, the local macro won't have the same implication. There is a better way to keep track of the estimation sample securely, which is to create an indicator variable immediately after model estimation, e.g.
gen byte regsample = e(sample)
and then if regsample is guaranteed to select precisely the same subset (including all the observations whenever they were all used).
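To see why the indicator is more robust than relying on e(sample), here is a short sketch, again assuming Stata's bundled auto dataset as a stand-in (price, mpg, weight and rep78 are illustrative names, not from your post). It fits a second model on a different sample and compares the two ways of selecting observations.

```stata
* Stand-in data: Stata's shipped auto dataset (an assumption, not the poster's data)
sysuse auto, clear

* First model: observations with missing rep78 are excluded
quietly regress price mpg weight rep78

* Freeze the estimation sample in a variable straight away
generate byte regsample = e(sample)

* A later model with no missing values uses every observation,
* so e(sample) now refers to a different, larger sample
quietly regress price mpg

count if e(sample)   // sample of the most recent model
count if regsample   // sample of the first model, unchanged

* Later regressions can be restricted to the original sample
regress price mpg weight if regsample
```

The variable regsample stays in the dataset until you drop it, whereas e(sample) is replaced by every new estimation command.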