Search code examples
rpanel

How to create a two way table for a panel data set?


I have the following data set in R:

Country     Year       Population
   A        2000         1,000
   A        2001         1,100
   A        2002         1,200
   B        2000         1,150
   B        2001       
   B        2003         1,400
   C        2000       
   C        2001         1,000
   C        2003         1,100

Where the empty spaces represent missing values. I am trying to create a two way table with the years of available data for the population column. Something like this:

Country  2000  2001  2002
   A       1     1     1
   B       1     0     1
   C       0     1     1

Solution

  • You can use dcast to convert your data into wide format. Also, you can use spread from dplyr.

    Method 1:

    library(data.table)
    dcast(df[!is.na(df$Population),], formula = Country ~ Year, fun.aggregate = length)
    
    print(df)
      Country 2000 2001 2002 2003
    1       A    1    1    1    0
    2       B    1    0    0    1
    3       C    0    1    0    1
    

    Method 2:

    df %>% 
        mutate(row_id = if_else(is.na(Population),0,1)) %>% 
        select(-Population) %>% 
        spread(Year, row_id,fill=0)
    
      Country 2000 2001 2002 2003
    1       A    1    1    1    0
    2       B    1    0    0    1
    3       C    0    1    0    1