r dataframe dplyr reshape reshape2

Rearranging data frame in R for panel analysis


I am having trouble rearranging my data frame so that it is suitable for panel analysis. The raw data looks like this (it covers all countries and 50 years; this is just the head):

head(suicide_data_panel)

country     variable  1970  1971 
Afghanistan suicide   NA    NA          
Afghanistan unempl    NA    NA          
Afghanistan hci       NA    NA      
Afghanistan gini      NA    NA          
Afghanistan inflation NA    NA          
Afghanistan cpi       NA    NA          

I would like it to be:

country     year    suicide  unempl 
Afghanistan 1970      NA    NA          
Afghanistan 1971      NA    NA          
Afghanistan 1972      NA    NA      
Afghanistan 1973      NA    NA          
Afghanistan 1974      NA    NA          
Afghanistan 1975      NA    NA

That way I can run a panel regression. I've tried to use dcast, but I don't know how to make it account for the different years:

suicide <- dcast(suicide_data_panel, country~variable, sum)

This command only takes the last year into account:

head(suicide)

    country         account    alcohol     
1   Afghanistan     -18.874843  NA  
2   Albania         -6.689212   NA  
3   Algeria         NA          NA  
4   American Samoa  NA          NA      
5   Andorra         NA          NA      
6   Angola          7.000035    NA

It sorts variables alphabetically. Please help.


Solution

  • You could try the tidyverse package:

    library(tidyverse)
    
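    suicide_data_panel %>%
      # stack the year columns into year/value pairs (long format)
      gather(year, dummy, -country, -variable) %>%
      # give each variable its own column: one row per country and year
      spread(variable, dummy)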
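
    Note that gather() and spread() are superseded as of tidyr 1.0.0. Here is a rough sketch of the same reshape with pivot_longer() and pivot_wider(). The toy data frame and its values are made up for illustration, and the names toy, long and panel are placeholders of mine, not anything from your data:

```r
library(tidyr)

# Toy stand-in for suicide_data_panel; the values are invented for illustration.
toy <- data.frame(
  country  = rep(c("A", "B"), each = 2),
  variable = rep(c("suicide", "unempl"), times = 2),
  `1970`   = c(1.0, 5.0, 2.0, 6.0),
  `1971`   = c(1.5, 5.5, NA, 6.5),
  check.names = FALSE  # keep "1970"/"1971" as column names
)

# Step 1: stack every year column into (year, value) pairs.
long <- pivot_longer(
  toy,
  cols      = -c(country, variable),
  names_to  = "year",
  values_to = "value"
)

# The year comes out as character because it was a column name.
long$year <- as.integer(long$year)

# Step 2: spread the variables into columns, one row per country and year.
panel <- pivot_wider(long, names_from = variable, values_from = value)
panel <- as.data.frame(panel)

print(panel)
```

    On the real data you would pass suicide_data_panel instead of toy. Converting year to integer is optional, but it is usually what a panel model expects for the time index.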