Search code examples
rjoindimensions

filter variables to connect two dataframes over different dimensions


There are two dataframes which I want to connect.

So because I have 2 dimensions to filter a value from the column of the second table which meets some conditions of the first table. The first dataframe looks like this:

 letter   year  value
    A        2001   
    B        2002
    C        2003
    D        2004

second one:

       letter  2001  2002 2003 2004
        A         4     9    9   9
        B         6      7   6    6  
        C         2      3   5    8 
        D         1       1  1    1

which gives me something like this

letter year    value
A       2001    4
B       2002    7
C       2003    5
D       2004    1

thank all of you

Solution

  • One option is to row/column index. Here, the row index can be sequence of rows, while the column index we get from matching the 'year' column of first data with the column names of second, cbind the indexes to create a matrix ('m1') and use that to extract values from second dataset and assign those to 'value' column in first data

    i1 <- seq_len(nrow(df1))
    j1 <- match(df1$year, names(df2)[-1])
    m1 <- cbind(i1, j1)
    df1$value <- df2[-1][m1]
    df1
    #   letter year value
    #1      A 2001     4
    #2      B 2002     7
    #3      C 2003     5
    #4      D 2004     1
    

    For the specific example, the pattern to extract seems to be the diagonal elements, in that case, we can also use

    df1$value <- diag(as.matrix(df2[-1])) 
    

    data

    df1 <- structure(list(letter = c("A", "B", "C", "D"), year = 2001:2004),
    class = "data.frame", row.names = c(NA, 
    -4L))
    
    df2 <- structure(list(letter = c("A", "B", "C", "D"), `2001` = c(4L, 
    6L, 2L, 1L), `2002` = c(9L, 7L, 3L, 1L), `2003` = c(9L, 6L, 5L, 
    1L), `2004` = c(9L, 6L, 8L, 1L)), class = "data.frame", 
    row.names = c(NA, 
    -4L))