
(From Stata to R) Converting egen-by to R


How could I convert this Stata command to R?

I have a dataset of individuals (each person is a row), but I also need some family-level variables for my analysis. In this case, I want the total income earned by each family. Each member of a family is an individual in the dataset, and although I don't have the individuals' identifiers, I do have a variable that identifies the family. Since I also know each individual's earnings in 2014, in Stata I use this command to create the variable:

egen family_inc = total(annual_inc), by (id_family)

where

  • family_inc is the total income of a family
  • annual_inc is the total income earned by the individual
  • id_family is the identification of this family in the data

So the command says to Stata: (1) For each member of the id_family; (2) Find all the members of that family; (3) Sum the income earned during 2014; (4) Assign this value to a new variable family_inc.

Could I use group_by() for this? I am very much a n00b at R and can't spare the time to take a course right now because of deadlines! course(df_damn, mother = FALSE, explicit = 3, !is.numeric("loads of"))


Solution

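  • The following Stata code

    webuse iris
    egen total_petal_width = total(petwid), by(iris)
    

    is equivalent to the following R code:

    library(dplyr)

    iris %>% 
        group_by(Species) %>% 
        mutate(
            # new_var_name    = function of other vars
            # sum() is computed within each Species; na.rm = TRUE mirrors
            # egen total(), which ignores missing values
            total_petal_width = sum(Petal.Width, na.rm = TRUE)
        )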
    

    If the answer is helpful and solves the question, please mark it as solved :)
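
Applied to the variables named in the question, the same pattern might look like the sketch below. The data frame `df` and its values are made up for illustration; only the column names `id_family`, `annual_inc`, and `family_inc` come from the question. It assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data: column names follow the question, values are invented.
df <- data.frame(
  id_family  = c(1, 1, 2, 2, 2),
  annual_inc = c(100, 200, 50, NA, 25)
)

df <- df %>%
  group_by(id_family) %>%
  # Sum income within each family and attach it to every member's row.
  # na.rm = TRUE skips missing incomes, as egen total() does in Stata.
  mutate(family_inc = sum(annual_inc, na.rm = TRUE)) %>%
  # Drop the grouping so later operations work on the whole data frame.
  ungroup()
```

As with egen, `mutate()` keeps one row per individual and repeats the family total on each row. Using `summarise()` in its place would instead collapse the data to one row per family.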