I need to create a series of new variables whose values are functions of other variables that are already defined. I have tried to write out the logic of what I need to do.
The variable var1 will be the "conditional":
if var1 >= 1 & var1 <= 2 {
    new_variable1 = var3 * 100000000 + var4
    new_variable2 = var5 * 1000
}
else {
    new_variable1 = var3 * 1000000 + 99 * 100 + var5 * 1000
    new_variable2 = var3 * 1000000 + var5 * 10000 + var4
}
Example sample:
var1 var2 var3 var4 var5
1101 1 10 3 20
1102 2 15 2 15
1103 1 12 2 15
1103 2 20 3 12
1102 3 10 1 10
1104 2 15 1 10
clear
input var1 var2 var3 var4 var5
1101 1 10 3 20
1102 2 15 2 15
1103 1 12 2 15
1103 2 20 3 12
1102 3 10 1 10
1104 2 15 1 10
end
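* cond() is evaluated observation by observation; long stores these large integers exactly
gen long new1 = cond(inrange(var2, 1, 2), var3 * 1e8 + var4, var3 * 1e6 + 9900 + var5 * 1000)
gen long new2 = cond(inrange(var2, 1, 2), var5 * 1000, var3 * 1e6 + var5 * 10000 + var4)
list, sep(0)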
     +----------------------------------------------------------+
     | var1   var2   var3   var4   var5         new1       new2 |
     |----------------------------------------------------------|
  1. | 1101      1     10      3     20   1000000003      20000 |
  2. | 1102      2     15      2     15   1500000002      15000 |
  3. | 1103      1     12      2     15   1200000002      15000 |
  4. | 1103      2     20      3     12   2000000003      12000 |
  5. | 1102      3     10      1     10     10019900   10100001 |
  6. | 1104      2     15      1     10   1500000001      10000 |
     +----------------------------------------------------------+
Thanks for the example. The values of var1 all fall way outside the interval [1, 2], so I have recast the example in terms of var2.
The most important detail is that the if command is quite wrong here: it does not loop over observations, but evaluates its condition just once (for a variable, using the value in the first observation). You could rewrite the code using the if qualifier, but it is simpler to use cond(), which is an if/else construct. For manipulations like this you shouldn't rely on the default float storage type, which cannot hold integers this large exactly.
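To see why the storage type matters, here is a rough sketch that repeats the first calculation as a float and compares it with the long version. It assumes the example data and new1 from the code above are still in memory; new1_f is just a hypothetical name for the comparison.

```stata
* Same expression as new1, but stored as float (illustration only)
gen float new1_f = cond(inrange(var2, 1, 2), var3 * 1e8 + var4, var3 * 1e6 + 9900 + var5 * 1000)

* Show all digits rather than the default display format
format new1_f %12.0f

* Compare side by side; the float copy loses the trailing digits of the large values
list var2 new1 new1_f, sep(0)

* Number of observations where the float copy differs from the long version
count if new1_f != new1
```

A float holds integers exactly only up to 16,777,216, so a value such as 1000000003 is rounded to 1000000000, while a smaller result such as 10019900 survives unchanged.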
Reading list:
help cond()
help inrange()
https://www.stata.com/support/faqs/programming/if-command-versus-if-qualifier/
https://www.stata-journal.com/sjpdf.html?articlenum=pr0016
https://www.stata-journal.com/sjpdf.html?articlenum=dm0026
Note that this isn't a problem needing a loop in Stata.
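For comparison, the if qualifier route mentioned earlier also needs no loop. A minimal sketch, assuming the example data and the new1 and new2 variables from the code above are in memory (new1_q and new2_q are hypothetical names):

```stata
* First branch: observations with var2 in [1, 2]
gen long new1_q = var3 * 1e8 + var4 if inrange(var2, 1, 2)
gen long new2_q = var5 * 1000 if inrange(var2, 1, 2)

* Second branch: every other observation
replace new1_q = var3 * 1e6 + 9900 + var5 * 1000 if !inrange(var2, 1, 2)
replace new2_q = var3 * 1e6 + var5 * 10000 + var4 if !inrange(var2, 1, 2)

* Check that both routes give the same results
assert new1_q == new1 & new2_q == new2
```

Each generate or replace statement is applied to every observation that satisfies its if qualifier, which is why this takes four statements where cond() takes two.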