Tags: stata, survey, dummy-variable

Grouping observations by ID while also creating characteristic variables


I'm working with a survey of individuals in Ecuador and I want to analyse the characteristics of each household. Every individual has a houseID, so I guess I need to group them by that variable while also creating some extra variables describing each household: for example, a dummy that is 1 if the household has two or more women. I will post an example below.

I would know how to do this in R (group_by), but I haven't found a similar command in Stata.

A simplified version of my data would be:

houseID         femaleDummy   maleDummy
10000000001     1             0
10000000001     1             0
10000000001     0             1
10000000002     0             1
10000000002     0             1

And I would like to get something like

houseID         twoFemalesormoreDummy
10000000001     1
10000000002     0

Solution

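  • Very easy, my friend: count the women in each household with egen, then compare that count with 2. This keeps one row per individual, with the household's value repeated on each row.

    * Count the women in each household; rows where femaleDummy is missing add 0
    bysort houseID: egen total_female = total(femaleDummy == 1)
    * 1 if the household has two or more women, 0 otherwise
    gen twoFemalesormoreDummy = total_female >= 2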
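
    If you want exactly one row per household, as in the desired output in the question, one option is collapse, which is probably the closest Stata counterpart to group_by followed by summarise in R. The sketch below is only illustrative: it re-enters the question's sample data with input, assumes femaleDummy is coded 0/1, and stores houseID as a double on the assumption that the real IDs are numeric (11-digit values are not held exactly by float and do not fit in long).

    ```stata
    * Rebuild the sample data from the question (illustrative only)
    clear
    input double houseID femaleDummy maleDummy
    10000000001 1 0
    10000000001 1 0
    10000000001 0 1
    10000000002 0 1
    10000000002 0 1
    end
    format houseID %12.0f

    * Reduce to one row per household, summing the female indicator
    collapse (sum) total_female = femaleDummy, by(houseID)

    * 1 if the household has two or more women, 0 otherwise
    gen byte twoFemalesormoreDummy = total_female >= 2

    list houseID twoFemalesormoreDummy, noobs
    ```

    Note that collapse replaces the data in memory, so wrap it in preserve / restore (or save first) if you still need the individual-level rows. If you would rather keep those rows, egen with the tag() function marks one observation per household, and you can list or analyse only the tagged rows.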