Search code examples
stata

Calculate a fraction based on multiple conditions


I have data in Stata in this format:

Value Treatment Area
1.1 1 1
2.5 1 2
3.4 0 3
6.7 0 1
3.6 0 2
2.9 1 3
2.5 0 1
7.8 1 3

I want to get a sum of the values in "Value" by area and by treatment status. This part I was able to do using:

by Area Treatment, sort: egen desired_variable = total(Value)

What I want is for each row of the dataset, a sum of the values in "Value" for all the treated AND untreated observations in the area. The issue with this is that it gives me the total for Treatment = 1 and Treatment = 0 for each Area, depending on its own treatment status.

I also tried:

by Area, sort: egen desired_variable = total(Value) if treatment == 1

and

by Area, sort: egen desired_variable = total(Value) if treatment == 0

But this is a different version of the same problem, since in a row where the treatment is = 0, the first line of code will just give an NA, and vice versa with the second one.


Solution

  • This kind of calculation is sometimes needed:

    clear
    input float value byte(treatment area)
    1.1 1 1
    2.5 1 2
    3.4 0 3
    6.7 0 1
    3.6 0 2
    2.9 1 3
    2.5 0 1
    7.8 1 3
    end
    
    bysort area: egen total1 = total(cond(treatment == 1, value, .)) 
    by area: egen total0 = total(cond(treatment == 0, value, .)) 
    

    See also Section 9 in https://journals.sagepub.com/doi/pdf/10.1177/1536867X1101100210