Tags: loops, conditional-statements, stata

loop with math operations


I need to compute a series of new values as functions of other variables that are already defined. I have tried to write out the logic of what I need below. var1 will be the variable that the condition tests.

if var1 >= 1 & var1 <= 2 {
new_variable1 = var3 * 100000000 + var4
new_variable2 = var5 * 1000
}

else {
new_variable1 = var3 * 1000000 + 99 * 100 + var5 * 1000
new_variable2 = var3 * 1000000 + var5 * 10000 + var4
}

Example sample:

var1    var2    var3    var4    var5    
1101    1   10  3   20
1102    2   15  2   15
1103    1   12  2   15
1103    2   20  3   12
1102    3   10  1   10
1104    2   15  1   10

Solution

    clear
    input var1    var2    var3    var4    var5    
    1101    1   10  3   20
    1102    2   15  2   15
    1103    1   12  2   15
    1103    2   20  3   12
    1102    3   10  1   10
    1104    2   15  1   10
    end 
    
    gen long new1 = cond(inrange(var2, 1, 2), var3 * 1e8 + var4, var3 * 1e6 + 9900 + var5 * 1000) 
    gen long new2 = cond(inrange(var2, 1, 2), var5 * 1000, var3 * 1e6 + var5 * 10000 + var4) 
    
    list, sep(0)
    
         +----------------------------------------------------------+
         | var1   var2   var3   var4   var5         new1       new2 |
         |----------------------------------------------------------|
      1. | 1101      1     10      3     20   1000000003      20000 |
      2. | 1102      2     15      2     15   1500000002      15000 |
      3. | 1103      1     12      2     15   1200000002      15000 |
      4. | 1103      2     20      3     12   2000000003      12000 |
      5. | 1102      3     10      1     10     10019900   10100001 |
      6. | 1104      2     15      1     10   1500000001      10000 |
         +----------------------------------------------------------+
    

    Thanks for the example. The values of var1 all fall way outside the interval [1, 2] so I have recast the example in terms of var2.

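    The most important detail is that the if command is quite wrong here, as it doesn't imply a loop over observations: it evaluates its condition just once, and a variable name in that condition is taken to mean the value in the first observation. You could rewrite the code using the if qualifier, but it is simpler to use cond(), which is an if/else construct. For manipulations like this you shouldn't rely on the default float storage type, which holds integers exactly only up to 16,777,216; hence the long in the code above.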
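The if-qualifier route mentioned above might look like the following. This is only a sketch, under the same recasting in terms of var2. The names new1_q and new2_q are hypothetical, chosen here so that they don't clash with new1 and new2. The last line illustrates the storage-type point: a float cannot hold a value like 1000000003 exactly.

```stata
clear
input var1 var2 var3 var4 var5
1101 1 10 3 20
1102 2 15 2 15
1103 1 12 2 15
1103 2 20 3 12
1102 3 10 1 10
1104 2 15 1 10
end

* Branch taken when var2 is 1 or 2; other observations are left missing for now.
* new1_q and new2_q are hypothetical names used only for this illustration.
gen long new1_q = var3 * 1e8 + var4 if inrange(var2, 1, 2)
gen long new2_q = var5 * 1000 if inrange(var2, 1, 2)

* Branch for every other observation (note that this includes missing var2).
replace new1_q = var3 * 1e6 + 9900 + var5 * 1000 if !inrange(var2, 1, 2)
replace new2_q = var3 * 1e6 + var5 * 10000 + var4 if !inrange(var2, 1, 2)

* Storage type check: rounding to float loses the final digits of 1000000003.
display %12.0f float(1000000003)
```

Each new variable takes two statements here rather than one, which is why cond() is the more compact choice.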

    Reading list:

    help cond()
    
    help inrange() 
    

    https://www.stata.com/support/faqs/programming/if-command-versus-if-qualifier/

    https://www.stata-journal.com/sjpdf.html?articlenum=pr0016

    https://www.stata-journal.com/sjpdf.html?articlenum=dm0026

    Note that this isn't a problem needing a loop in Stata.
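To see why the if command is the wrong tool and why no loop is needed, here is a small sketch on a hypothetical two-observation dataset. The variable names branch_cmd and branch_gen are made up for illustration. The if command decides once, from the first observation, whereas generate is already applied to every observation.

```stata
clear
input var2
1
3
end

* The if command evaluates its condition once; var2 here means var2[1], which is 1.
if inrange(var2, 1, 2) {
    gen byte branch_cmd = 1
}
else {
    gen byte branch_cmd = 2
}

* generate with cond() is evaluated separately for each observation.
gen byte branch_gen = cond(inrange(var2, 1, 2), 1, 2)

list, sep(0)
```

In this sketch branch_cmd is 1 in both observations, even though var2 is 3 in the second, while branch_gen differs between them.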