
Cross Correlation (CCF) in R for each ID / by ID


I am quite new to R. My data looks (simplified) like this:

ID <- c(1,1,1,1,2,2,3,3,3,3,4,4,4)
Affect <- c(0.8, 0.5, NA, 0.8, 0.2, 0.1, 0.7, 1.1, 0.9, 0.5, 0.3, NA, 0.9)
Paranoia <-  c(0.9, 0.6, 0.4, 0.2, 0.1, NA, 0.3, 0.1, 0.9, 1.5, 0.4, 0.1, 0.6)
df <- cbind(ID, Affect, Paranoia)

I want to calculate a cross-correlation in R for each ID, in order to find out whether affect precedes paranoia or the other way round. How can I do this? I tried several ways but never succeeded. Thank you in advance!


Solution

  • We can remove the IDs for which 'Affect' or 'Paranoia' is entirely NA, replace the remaining NAs with 0 (replace_na), and apply ccf within each ID. Note that replacing NA with 0 treats a missing value as an observed zero.

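    # ccf() is part of the base 'stats' package, so no extra time-series package is needed
    library(dplyr)
    library(tidyr)
    # df must be a data frame (see 'data' at the end), not the matrix returned by cbind()
    out <- df %>%
             group_by(ID) %>%
             # drop IDs where one of the two series is entirely missing
             filter(!(all(is.na(Affect))|all(is.na(Paranoia)))) %>% 
             # ccf() fails on NA by default, so fill the remaining NAs with 0
             mutate_at(vars(Affect, Paranoia), replace_na, 0) %>% 
             # one ccf object per ID, stored in a list column (plot = FALSE suppresses the plots)
             summarise(ccfout = list(ccf(Affect, Paranoia, plot = FALSE)))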
    
    
    out$ccfout[[1]]
    #
    #Autocorrelations of series ‘X’, by lag
    
    #    -3     -2     -1      0      1      2      3 
    #-0.264 -0.078  0.575  0.229 -0.246 -0.521  0.305 
    out$ccfout[[3]]
    
    #Autocorrelations of series ‘X’, by lag
    
    #    -3     -2     -1      0      1      2      3 
    #-0.163  0.449  0.408 -0.735 -0.490  0.286  0.245 
    
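Replacing NA with 0 can distort the correlations in short series. As a hedged alternative, assuming pairwise-complete estimation is acceptable for the data, `ccf` accepts `na.action = na.pass`, in which case the covariances are computed from the complete cases. A minimal sketch for ID 1 only, using the values from the question (the names `affect_1`, `paranoia_1` and `cc_1` are made up for this illustration):

```r
# Values for ID 1 from the question; Affect has one missing observation
affect_1   <- c(0.8, 0.5, NA, 0.8)
paranoia_1 <- c(0.9, 0.6, 0.4, 0.2)

# na.pass lets the NA through instead of failing; plot = FALSE suppresses the plot
cc_1 <- ccf(affect_1, paranoia_1, na.action = na.pass, plot = FALSE)

# Lags and estimates as plain vectors, one row per lag
data.frame(lag = as.vector(cc_1$lag), r = as.vector(cc_1$acf))
```

With series this short, some lags rest on a single pair of observations, so the estimates should be read with caution whichever NA strategy is used.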

    Or using group_split/map (this variant keeps every ID, since no filter is applied):

    library(purrr)
    df %>%
        group_split(ID) %>% 
        map(~ .x %>% 
                mutate_at(vars(Affect, Paranoia), replace_na, 0) %>% 
            {ccf(.$Affect, .$Paranoia)})
    #[[1]]
    
    #Autocorrelations of series ‘X’, by lag
    
    #    -3     -2     -1      0      1      2      3 
    #-0.264 -0.078  0.575  0.229 -0.246 -0.521  0.305 
    
    #[[2]]
    
    #Autocorrelations of series ‘X’, by lag
    
    #0 
    #1 
    
    #[[3]]
    
    #Autocorrelations of series ‘X’, by lag
    
    #    -3     -2     -1      0      1      2      3 
    #-0.163  0.449  0.408 -0.735 -0.490  0.286  0.245 
    
    #[[4]]
    
    #Autocorrelations of series ‘X’, by lag
    
    #    -1      0      1 
    #-0.289  0.954 -0.636 
    
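To relate these tables to the original question (which variable comes first), the sign convention matters: the lag k value returned by `ccf(x, y)` estimates the correlation between `x[t+k]` and `y[t]`. With `ccf(Affect, Paranoia)`, a strong correlation at a negative lag is therefore consistent with Affect preceding Paranoia, and one at a positive lag with Paranoia preceding Affect. The sketch below picks, for each ID, the lag with the largest absolute correlation. It uses base R only, the same NA-to-0 replacement as above, and a hypothetical helper named `peak_lag` that is not part of any package.

```r
ID <- c(1,1,1,1,2,2,3,3,3,3,4,4,4)
Affect <- c(0.8, 0.5, NA, 0.8, 0.2, 0.1, 0.7, 1.1, 0.9, 0.5, 0.3, NA, 0.9)
Paranoia <- c(0.9, 0.6, 0.4, 0.2, 0.1, NA, 0.3, 0.1, 0.9, 1.5, 0.4, 0.1, 0.6)
df <- data.frame(ID, Affect, Paranoia)

# Same NA handling as in the answer: missing values become 0
df$Affect[is.na(df$Affect)] <- 0
df$Paranoia[is.na(df$Paranoia)] <- 0

# Hypothetical helper: lag with the largest absolute cross-correlation for one ID
peak_lag <- function(d) {
  cc <- ccf(d$Affect, d$Paranoia, plot = FALSE)
  i <- which.max(abs(cc$acf))
  data.frame(ID = d$ID[1], lag = cc$lag[i], r = cc$acf[i])
}

# One row per ID: split the data by ID, apply the helper, bind the results
peaks <- do.call(rbind, lapply(split(df, df$ID), peak_lag))
peaks
```

For ID 1 this selects lag -1 (0.575 in the output above), and for ID 3 it selects lag 0 (-0.735). With only two to four observations per ID, as in the example data, such peaks are very noisy and should not be taken as evidence of a lead-lag relationship on their own.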

    data

    df <- data.frame(ID, Affect, Paranoia)
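The question builds `df` with `cbind`, which returns a numeric matrix rather than a data frame, while dplyr verbs such as `group_by` expect a data frame; that is presumably why the object is rebuilt with `data.frame` here. A small check on shortened example vectors (the names `m`, `d` and `d2` are made up for this illustration):

```r
ID <- c(1, 1, 2)
Affect <- c(0.8, 0.5, 0.2)

m <- cbind(ID, Affect)       # numeric matrix with columns ID and Affect
d <- data.frame(ID, Affect)  # data frame with the same columns

is.data.frame(m)  # FALSE
is.data.frame(d)  # TRUE

# An existing matrix can also be converted instead of rebuilt
d2 <- as.data.frame(m)
```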