
Generate dummies in Stata vs. in R


Stata

r, u, and s are dummies. Does the following line also generate a dummy n that equals 1 if r, u, or s equals 1, with the ==1 simply omitted after r, u, and s?

generate byte n = r | u | s

R

Does it make a difference whether we specify byte when generating the variable, and does the same apply in R?


Solution

  • This answer addresses Stata questions only.

    In Stata, if r, u, and s are all 0/1 variables, then r | u | s is also 0/1: it is 1 if any of them is 1, and 0 if and only if all are 0. So it is equivalent to max(r, u, s).

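    But watch out: if r, u, and s can be 0, 1, or missing, then r | u | s will also be 1 if any of them is missing, because Stata treats any nonzero value, including missing, as true. By contrast, max(r, u, s) ignores missing arguments and will be missing only if all of them are missing.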
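
    The difference can be seen on a small made-up dataset. This is only a sketch: the four observations and the names n_or and n_max are hypothetical, chosen to cover the all-zero, one-missing, and all-missing cases.

    ```stata
    * Hypothetical toy data: three 0/1 indicators, some values missing
    clear
    input r u s
    0 0 0
    1 0 0
    0 . 0
    . . .
    end

    * Logical "or": a missing value is nonzero, so it counts as true
    generate byte n_or = r | u | s

    * max() ignores missing arguments unless all of them are missing
    generate byte n_max = max(r, u, s)

    list, noobs
    ```

    In the third observation (0, missing, 0), n_or should be 1 while n_max is 0; in the last observation (all missing), n_or should be 1 while n_max is missing.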

    If missings are present, then you could use

      * 1 
      gen n = r | u | s if !missing(r, u, s) 
    

    The result will be missing if any of r, u, s is missing. Otherwise, it will be 1 if any of them is 1 and 0 if all of them are 0.

      * 2 
      gen n = (r == 1) | (u == 1) | (s == 1) 
    

    The result will be 1 if any argument is 1 and 0 otherwise. "Otherwise" is anything from all 0s to all missings.

      * 3 
      gen n = inlist(1, r, u, s) 
    

    #3 is equivalent to #2.

    In all cases, specifying byte is good practice to save on storage, but not material otherwise.
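
    The three variants can be compared side by side. The sketch below assumes a hypothetical five-observation dataset and uses the illustrative names n1, n2, and n3 for the results of #1, #2, and #3.

    ```stata
    * Hypothetical toy data, including a row with both a 1 and a missing
    clear
    input r u s
    0 0 0
    1 0 0
    0 . 0
    1 . 0
    . . .
    end

    * #1: missing whenever any argument is missing
    generate byte n1 = r | u | s if !missing(r, u, s)

    * #2: 1 only if some argument equals 1, otherwise 0
    generate byte n2 = (r == 1) | (u == 1) | (s == 1)

    * #3: same rule as #2, written with inlist()
    generate byte n3 = inlist(1, r, u, s)

    * #2 and #3 should agree on every observation
    assert n2 == n3

    list, noobs
    ```

    The fourth observation (1, missing, 0) is where #1 and #2 part ways: n1 is missing because one argument is missing, while n2 and n3 are 1 because one argument equals 1.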