I am not very experienced with Stata and would appreciate your help. First, I want to know the number of observations in the dataset. Second, I want to delete the last observations: say I have 100 observations, I want to delete the last 10. Third, I want to create a dummy of the form "Overweight, if BMI>25 and BMI<=30". Which commands can I use?
To get the number of observations in a dataset you can use the command `count`. That displays the number of observations in the dataset. In many cases you can use `_N` to represent the number of observations programmatically in an expression.
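As a minimal sketch of both approaches, the lines below build a throwaway dataset with 100 observations (an assumption made only to mirror the question) and then report its size. The local macro name `n` is just for illustration.

```stata
* Create an empty example dataset with 100 observations
clear
set obs 100

* Display the number of observations
count

* _N holds the same number and can be used inside expressions
display _N

* count also leaves its result in r(N), which can be stored for later use
count
local n = r(N)
display "The dataset has `n' observations"
```

`count` also accepts an `if` condition when only a subset should be counted.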
You can use the command `drop` in combination with `in` to delete observations based on their sort order. `drop in -10/l` (notice that the last character is a lowercase L) means that all observations from the 10th-from-last observation through the last one (`l` as in last) will be dropped.
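A small sketch of dropping the last 10 of 100 observations follows. The dataset and the variable `id` are made up for the example; since "last" depends on the current sort order, the data are sorted first.

```stata
* Example dataset: 100 observations numbered 1 to 100
clear
set obs 100
generate id = _n

* "Last" refers to the current sort order, so sort first if that matters
sort id

* Drop the last 10 observations (the final character is a lowercase L)
drop in -10/l

* Check how many observations remain
count

* An alternative that should have the same effect on a fresh copy of the data:
*   drop if _n > _N - 10
```

After this runs, the observations with `id` 91 through 100 are gone and 90 remain.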
A dummy only takes the value `1`, `0` or missing. You can use value labels in Stata to have `1` represent a string like "Overweight". Anyway, if you have a numeric variable called `BMI`, you can create your dummy with `generate overweight = (BMI > 25 & BMI <= 30) if !missing(BMI)`. The `if !missing(BMI)` part makes the `overweight` dummy missing if `BMI` is missing. Without this part it would have been `0` (Stata treats missing as larger than any number, so `BMI <= 30` is false), which is not the same thing: `0` means not overweight, while missing means that we do not know whether the person is overweight, which is epistemologically very different.
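To illustrate, here is a short sketch with five made-up `BMI` values, one of them missing. The label name `overweight_lbl` and the label texts are assumptions chosen for the example.

```stata
* Made-up example data, including one missing BMI
clear
input BMI
22
27.5
30
31
.
end

* 1 if 25 < BMI <= 30, 0 otherwise, missing when BMI is missing
generate overweight = (BMI > 25 & BMI <= 30) if !missing(BMI)

* Attach value labels so that 1 is displayed as "Overweight"
label define overweight_lbl 0 "Not overweight" 1 "Overweight"
label values overweight overweight_lbl

* Inspect the result, showing the missing category as well
list
tabulate overweight, missing
```

The variable still stores `0` and `1`; the labels only change how the values are displayed.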