Relative importance for several groups in R


How do I calculate relative importance with the relaimpo package in R separately for several groups? As an example, in the mtcars data frame I want to calculate the relative importance of several variables for mpg within each value of cyl. I can calculate the relative importance for the whole data set (see below), but I don't know how to do it per group. I tried inserting group_by(cyl), but that did not work. How would I do this in R?

library(relaimpo)
df <- mtcars

model <- lm(mpg ~ disp + hp + drat + wt, data=df)
    
rel_importance <- calc.relimp(model, type = "lmg", rela = TRUE)
rel_importance

Solution

  • I'm not familiar with this package, but in general, to apply a function by group in R you can split the data frame into a list with one data frame per group and then apply the function to each element of that list.

    In this case:

    cyl_list <- split(df, df$cyl)
    
    rel_importance_cyl <- lapply(
        cyl_list,
        \(group_df) {  # \(x) lambda syntax needs R >= 4.1; the argument is one cyl subset
            model <- lm(mpg ~ disp + hp + drat + wt, data = group_df)
            calc.relimp(model, type = "lmg", rela = TRUE)
        }
    )
    
    names(rel_importance_cyl) # "4" "6" "8"
    

    You can access the elements of this list either by name (e.g. rel_importance_cyl[["4"]]) or by position (e.g. rel_importance_cyl[[1]]) to see the result for each group.
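
    To compare the groups side by side, the per-group results can be collected into one matrix. The following is only a sketch: it assumes the object returned by calc.relimp stores the LMG shares as a named numeric vector in its lmg slot, as the relaimpo documentation describes, and the names fit_one_group and lmg_by_cyl are made up for this illustration.

```r
library(relaimpo)

# One data frame per cyl value: "4", "6", "8"
cyl_list <- split(mtcars, mtcars$cyl)

# Hypothetical helper: fit the model on one subset and return only
# the LMG shares (assumed to live in the `lmg` slot of the result).
fit_one_group <- function(group_df) {
  model <- lm(mpg ~ disp + hp + drat + wt, data = group_df)
  calc.relimp(model, type = "lmg", rela = TRUE)@lmg
}

# sapply() binds the named vectors into a matrix:
# rows = predictors, columns = cyl groups
lmg_by_cyl <- sapply(cyl_list, fit_one_group)

round(lmg_by_cyl, 3)
```

    Because rela = TRUE rescales the shares, each column should sum to 1. Note that the cyl == 6 group has only 7 rows, so a model with four predictors leaves very few residual degrees of freedom there, and its shares should be read with caution.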