Creating a variable based on other variables using R


My data looks like this:

hh_id  indl  ind_salary  hh_income
1      1     200
1      2     450
1      3     0
2      4     1232
2      5     423

Individuals with the same hh_id live in the same household, so they share the same household income. The variable hh_income should therefore equal the sum of the salaries of all persons with the same hh_id.

So my data would look like this:

hh_id  indl  ind_salary  hh_income
1      1     200         650
1      2     450         650
1      3     0           650
2      4     1232        1655
2      5     423         1655

Any ideas, please?


Solution

  • You can use the base R function ave() to compute the sum of ind_salary within each hh_id group. It returns a vector of the same length as ind_salary, so the result can be assigned directly as a new column:

    > df$hh_income <- ave(df$ind_salary, df$hh_id, FUN=sum)
    > df
      hh_id indl ind_salary hh_income
    1     1    1        200       650
    2     1    2        450       650
    3     1    3          0       650
    4     2    4       1232      1655
    5     2    5        423      1655
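
For anyone who wants to reproduce the result from scratch, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the question's table. The names totals and check, and the tapply() cross-check, are additions for illustration only and not part of the original answer.

```r
# Rebuild the example data from the question (hh_income not yet computed).
df <- data.frame(
  hh_id      = c(1, 1, 1, 2, 2),
  indl       = 1:5,
  ind_salary = c(200, 450, 0, 1232, 423)
)

# ave() applies FUN within each hh_id group and returns a vector as long
# as ind_salary, so it can be assigned directly as a new column.
# FUN must be passed by name because it comes after `...` in ave().
df$hh_income <- ave(df$ind_salary, df$hh_id, FUN = sum)

# Illustrative cross-check: tapply() gives one total per household,
# which is then looked up for every row by its hh_id.
totals <- tapply(df$ind_salary, df$hh_id, sum)
check  <- unname(totals[as.character(df$hh_id)])
stopifnot(all(df$hh_income == check))

print(df)
```

If ind_salary may contain missing values, one option is to pass FUN = function(x) sum(x, na.rm = TRUE) so that a single NA does not turn the whole household total into NA.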