In my Stata do scripts, I often have to compare dates that may be missing. Unfortunately, the internal representation of . is the largest possible number of the given range, so the following holds:
5 < .
This can become quite annoying, e.g. when checking whether a date is within a certain range:
gen between_start_stop = . if d == .
replace between_start_stop = 1 if ///
!missing(d) & !missing(start) & !missing(stop) & ///
start < d & d < stop
replace between_start_stop = 0 if ///
((!missing(d) & !missing(start) & !(start < d)) | ///
(!missing(d) & !missing(stop) & !(d < stop)))
instead of the following:
gen between_start_stop = (start < d) & (d < stop)
Is there a way to use comparison operators that work with ternary logic?
I.e., I would like the following statements to be true:
(5 < .) == .
(. < .) == .
(. < 5) == .
(. & 1) == .
(. & 0) == 0
etc...
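For contrast, here is a small sketch of what the built-in two-valued operators return for the same expressions. The comments give the values Stata displays; none of them is missing, which is exactly the problem.

```stata
* Built-in comparisons and logical operators never return missing
display (5 < .)    // 1: . sorts above every nonmissing number
display (. < .)    // 0
display (. < 5)    // 0
display (. & 1)    // 1: missing is nonzero, so it counts as true
display (. & 0)    // 0
```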
A couple of suggestions:
Use inrange() (also look at inlist()) to specify ranges instead of a series of < and > comparisons; use missing() or !missing() with several arguments, as in !missing(start, stop, d); and it really sounds like you want cond(), which can be nested to handle multiple conditions in one expression (based on an example from the help file):
g var = cond(missing(x), ., cond(x > 2, 50, 70))
This returns . if x is missing, 50 if x > 2, and 70 otherwise (that is, if x <= 2).
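Applied to the date-range check from the question, a minimal sketch might look like the following. The toy data and the helper variables below and above are hypothetical names introduced only for illustration; the logic assumes the strict inequalities and the three outcomes described in the question (1 = inside, 0 = known to be outside, . = cannot tell).

```stata
* Toy data chosen only to exercise each case (hypothetical values)
clear
input d start stop
 5 1 10
 . 1 10
 5 . 10
 5 1  .
 0 1  .
12 . 10
end

* Known violations: both operands observed and the inequality fails
gen byte below = !missing(d, start) & (d <= start)   // start < d is false
gen byte above = !missing(d, stop)  & (d >= stop)    // d < stop is false

* 0 if either bound is known to be violated,
* . if nothing is violated but something is missing,
* 1 if all three values are observed and start < d < stop
gen byte between_start_stop = cond(below | above, 0, ///
    cond(missing(d, start, stop), ., 1))

list d start stop between_start_stop
```

Note that inrange() is not a drop-in replacement here: as far as I understand the documentation, it uses inclusive bounds, treats a missing bound as open-ended, and returns 0 rather than . when the tested value is missing, so it stays two-valued.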