Stata: adding a holding-period variable for each transaction


I have a Stata dataset of housing transactions in which every row is one year in the holding period of a transaction. I want to model the probability of a house being sold in a given year with a probit model. The dummy indicates whether the house was sold in that year (1 = sold, 0 = not sold).

Now I want to add another variable that contains the holding period of each transaction. This is an example of what I have now:

dummy  year_bought  current_year
0      1620         1621
0      1620         1622
0      1620         1623
1      1620         1624
0      1622         1623
0      1622         1624
0      1622         1625
0      1622         1626
0      1622         1627
1      1622         1628

And this is what I need it to become:

dummy  year_bought  current_year  holding_period
0      1620         1621          4
0      1620         1622          4
0      1620         1623          4
1      1620         1624          4
0      1622         1623          6
0      1622         1624          6
0      1622         1625          6
0      1622         1626          6
0      1622         1627          6
1      1622         1628          6

Solution

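  • Assuming you have some kind of id variable for each house (called house_id here), and that the last year observed for each house is the year it was sold:

    * latest year observed for each house, copied to every row of that house
    egen sold_year = max(current_year), by(house_id)
    * years between purchase and sale
    gen holding_period = sold_year - year_bought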
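
If the data have no id variable, or some houses are never sold within the sample, a variant along the following lines may help. It is only a sketch built on the example rows from the question. The way house_id is constructed here is an assumption (it treats every row that follows a sale as the start of a new transaction, so it depends on the rows being in the order shown), and taking the sale year only from rows with dummy == 1 is a design choice that is not part of the original answer.

```stata
clear
input dummy year_bought current_year
0 1620 1621
0 1620 1622
0 1620 1623
1 1620 1624
0 1622 1623
0 1622 1624
0 1622 1625
0 1622 1626
0 1622 1627
1 1622 1628
end

* Hypothetical id: start a new transaction after every row with dummy == 1.
* This relies on the current row order; use a real id variable if you have one.
gen long house_id = sum(dummy[_n-1] == 1) + 1

* Year of sale: use current_year only on the row where the sale happens.
* Houses with no sale in the data get a missing sold_year.
egen sold_year = max(cond(dummy == 1, current_year, .)), by(house_id)

* Holding period, constant within each transaction.
gen holding_period = sold_year - year_bought

list, sepby(house_id)
```

With this version, a house that is never sold in the data ends up with a missing holding_period rather than the span up to its last observed year.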