
Mapping a model to a permuted data set sometimes returns the model equation, rather than model output


I am trying to use the permute function in modelr with purrr map to calculate the mean values of two categories of data under permutation.

The approach behaves as expected when I fit linear models to the permuted data sets, as in the example for modelr::permute (though I run the linear model inside a custom function):

library(tidyverse) 
library(modelr)

perms <- permute(mtcars,  1000, mpg)
jlm <- function(df){lm(mpg ~ wt, data = df)}
models3 <- map(perms$perm, jlm)
models3[[1]]
Call:
lm(formula = mpg ~ wt, data = df)

Coefficients:
(Intercept)           wt  
     28.211       -2.524

Now, instead of a linear model, I just want mean values for two categories in that data set. I tried running as follows.

mean_of_vs <- function(df){
  df %>% group_by(vs) %>% summarize(mean(mpg)) %>% spread(vs, `mean(mpg)`) %>%
    rename(zero = `0`, one = `1`)
}

models4 <- map(perms$perm, ~mean_of_vs)

models4[[1]]

But this just returns the function definition, rather than the output of the function:

function(df){
  df %>% group_by(vs) %>% summarize(mean(mpg)) %>% spread(vs, `mean(mpg)`) %>%
    rename(zero = `0`, one = `1`)
}

The function works by itself on a simple data frame:

test <- perms %>% pull(perm) %>% .[[1]] %>% as.data.frame

mean_of_vs(test)
# A tibble: 1 x 2
   zero   one
  <dbl> <dbl>
1  16.6  24.5

So my question is, why doesn't my custom function return a bunch of one line data frames with the mean value of vs = 0 and vs = 1 and how would I get it to do this?

Thanks.


Solution

  • I am glad to meet you.

    modelr::permute produces objects whose class is 'permutation', not data frames:

    > class(perms[[1]][1][[1]])
    [1] "permutation"

    A permutation object has 3 components:

    data: the underlying data

    columns: the columns you permute

    idx: indexes indicating which rows have been selected

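    I think a permutation object only works directly with functions that convert their data argument themselves. lm() is one of these: it appears to call as.data.frame() on its input, and modelr provides an as.data.frame() method for permutations (I am not sure of the full list of such functions). dplyr verbs such as group_by() do not do this conversion.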

    So if you want to use your own function, you have to convert the permutation to a data.frame (or a data.table/tibble) first, like below:

    mean_of_vs <- function(df){
      df %>% as.data.frame() %>% group_by(vs) %>% summarize(mean(mpg)) %>% spread(vs, `mean(mpg)`) %>%
        rename(zero = `0`, one = `1`)
    }
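
To see what the as.data.frame() step does, it can help to inspect a single permutation object. The following is a minimal sketch, not taken from the modelr documentation: the seed and the small number of permutations are my own assumptions, chosen only to keep the example quick and reproducible.

```r
library(modelr)

set.seed(1)                       # assumption: seed added only for reproducibility
perms <- permute(mtcars, 3, mpg)  # 3 permutations is an arbitrary small number
p <- perms$perm[[1]]

class(p)                          # the object is not a data frame
str(unclass(p), max.level = 1)    # show the pieces stored inside the object

# Explicit conversion gives an ordinary data frame in which only mpg is shuffled
df <- as.data.frame(p)
is.data.frame(df)
identical(sort(df$mpg), sort(mtcars$mpg))  # same mpg values, different order
```

After the conversion, df can be passed to group_by(), summarize() and other functions that expect a data frame.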
    

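    Then, call map() without the ~ notation. Writing ~mean_of_vs creates an anonymous function whose body is just the name mean_of_vs, so each call returns the function itself instead of running it. Pass mean_of_vs directly (or write ~mean_of_vs(.x)).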

    models4 <- map(perms$perm, mean_of_vs)
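
The difference between the three spellings can be seen on a toy case. This is only an illustrative sketch: double_it is a hypothetical stand-in for mean_of_vs and is not part of the original question.

```r
library(purrr)

# Hypothetical stand-in for mean_of_vs(), used only for illustration
double_it <- function(x) x * 2

wrong      <- map(1:3, ~double_it)      # lambda body is just a name: returns the function
right      <- map(1:3, double_it)       # pass the function itself to map()
also_right <- map(1:3, ~double_it(.x))  # or call it explicitly inside the lambda

is.function(wrong[[1]])       # TRUE: each element is the function, not a result
identical(right, also_right)  # TRUE: both forms actually call double_it()
```

The same reasoning applies to map(perms$perm, ~mean_of_vs): every element of the result is the function mean_of_vs itself.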
    

    Then you will get the result

    .....

    [[97]]
    # A tibble: 1 x 2
       zero   one
      <dbl> <dbl>
    1  21.4  18.4

    [[98]]
    # A tibble: 1 x 2
       zero   one
      <dbl> <dbl>
    1  20.4  19.7

    .....
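
If a single data frame is more convenient than a list of one-row tibbles, the results can be stacked afterwards. The sketch below is one possible way, under a few assumptions of mine: the seed, the 100 permutations, the intermediate column name mean_mpg and the id column name "perm" are not from the original post.

```r
library(tidyverse)
library(modelr)

set.seed(42)                        # assumption: seed only for reproducibility
perms <- permute(mtcars, 100, mpg)  # assumption: 100 permutations to keep it fast

mean_of_vs <- function(df) {
  df %>%
    as.data.frame() %>%                   # convert the permutation first
    group_by(vs) %>%
    summarize(mean_mpg = mean(mpg)) %>%   # named column instead of `mean(mpg)`
    spread(vs, mean_mpg) %>%
    rename(zero = `0`, one = `1`)
}

models4 <- map(perms$perm, mean_of_vs)

# Stack the one-row tibbles; "perm" records each element's position in the list
perm_means <- bind_rows(models4, .id = "perm")
head(perm_means)
```

Each row of perm_means then holds the two group means for one permutation.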