
How to perform partial correlation analysis with a grouping variable on a list of datasets


I am using the following reproducible example, which is close to my real case. I have been trying to run a partial correlation analysis between mpg and disp, hp, vs. I am not sure whether the pcor function can do that, but this is the code I have prepared:

    library(dplyr)  # provides %>%, mutate and group_by
    library(ggm)

    list = list(mtcars, mtcars)
    list = lapply(list, function(x) x %>%
                    mutate(gear = as.factor(gear)))
    lapply(list, function(x) x %>%
             group_by(gear) %>% pcor(c(mpg, disp, hp, vs), var(x)))

I am not sure what is wrong, but I get the following error:

Error in pcor(., c(mpg, disp, hp, vs), var(x)) : unused argument (var(x))

What would you suggest to solve this? Do you recommend running this analysis on separate datasets, but with a different iterative method (which is the approach I would like to use)?

Thanks


Solution

  • Your syntax doesn't really make sense here. The ggm::pcor function takes two arguments, u and S, but you are passing three: when you use the pipe operator, the result of the previous step is passed as the first argument of the following function, so the grouped data frame lands in u and var(x) is left over.

    Also the docs say that the first argument, u, should be:

    a vector of integers of length > 1. The first two integers are the indices of variables the correlation of which must be computed. The rest of the vector is the conditioning set.

    As it turns out, this is not quite right, since the example goes on to give a vector of column names to the argument u, which are used to subset the rows and columns of the matrix passed to argument S.
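
    The name-based indexing can be checked in base R without ggm. This is a small sketch, assuming pcor simply subsets S by u (as the documentation example suggests); the variable names u_names, u_index and same are illustrative only:

    ```r
    # Covariance matrix of mtcars; its dimnames are the column names.
    S <- var(mtcars)

    # The same four variables, given by name and by integer position.
    u_names <- c("mpg", "disp", "hp", "vs")
    u_index <- match(u_names, colnames(S))

    # Both forms select the same 4 x 4 block of S.
    same <- identical(S[u_names, u_names], S[u_index, u_index])
    same
    ```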

    The bottom line is that, if you want a partial correlation for each level of gear for each data frame in your list, you will need to do something like:

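    library(ggm)

    # The first two variables are correlated; the rest are conditioned on
    vars <- c('mpg', 'disp', 'hp', 'vs')

    # `list` is the list of data frames built in the question
    lapply(list, function(x) {
      # one partial correlation per level of gear
      sapply(split(x, x$gear), function(d) {
        pcor(u = vars, S = var(d[, vars]))
      })
    })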
    #> [[1]]
    #>          3          4          5 
    #> -0.3572209 -0.7089970 -0.2900153 
    #> 
    #> [[2]]
    #>          3          4          5 
    #> -0.3572209 -0.7089970 -0.2900153
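
As a cross-check that does not depend on ggm, the same quantity can be sketched in base R. This assumes the standard definition of partial correlation via the inverse covariance (precision) matrix; partial_cor is a hypothetical helper name, not part of any package. It uses the plain mtcars data, since converting gear to a factor does not change how the rows are grouped.

```r
# Partial correlation of the first two variables in `vars`,
# given the remaining ones, from the inverse covariance matrix.
# `partial_cor` is a hypothetical helper for illustration only.
partial_cor <- function(d, vars) {
  P <- solve(var(d[, vars]))            # precision matrix of the selected columns
  -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
}

vars <- c("mpg", "disp", "hp", "vs")

# One value per level of gear.
by_gear <- sapply(split(mtcars, mtcars$gear), partial_cor, vars = vars)
by_gear
```

If the helper matches what pcor computes, by_gear should agree with the per-gear values printed above. Wrapping the sapply call in lapply over a list of data frames gives the same nested structure as in the answer.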