Search code examples
rtime-seriespanel

reshape time series into panel in r


Now I have time series GDP data for 3 countries. I would like to create a panel for dataset, for further panel analysis. I don't understand how to create it whith reshape package a plm.

           AT      CZ       DE
 1995    68410.7 30457.3 630631.5
 1995.25 68353.5 30213.1 625515.3
 1995.5  68103.3 29766.4 623124.0
 1995.75 67896.0 29661.8 621122.0
 1996    67888.8 29595.8 616673.1
 1996.25 67874.5 29880.0 616645.4

I've found that I can reshape data in such way:

long <- reshape(as.data.frame(GDP.series),varying = list(names(GDP.series)), v.names="GDP",
            timevar = "Country", idvar = "time", ids = row.names(GDP.series),
            times = names(GDP.series),   new.row.names = 1:((dim(GDP.series)[2])*(dim(GDP.series)[1])),direction = "long")

And after that data are in looks like:

    Country     GDP
 1      AT 49149.0
 2      AT 49555.5
 3      AT 49475.9
 4      AT 49507.6
 5      AT 49888.9
 6      AT 50324.5

But the problem with this transformation is that the infromation about time periods is lost. I'm quite beginner, and not everything in code behind is understable for me, especially this part:

  "new.row.names = 1:((dim(GDP.series)[2])*(dim(GDP.series)[1])),direction = "long""

So know my question is how to improve/change code in case data have following format:

       Country     GDP
2013       AT 49149.0
2012.75    AT 49555.5
2012.5     AT 49475.9
2012.25    AT 49507.6
2011       AT 49888.9
2011.75    AT 50324.5

Or if I need to use some other function? Thank you in advance. (Code is taken from this topic: Data Transformation in R for Panel Regression)


Solution

  • This shall answer yours! however remember that rownames of a dataframe must be unique and so you cannot have that. check my output:

    data = data.frame(AT = 1:6,CZ = 11:16,DE = 21:26)
    rownames(data) = c(2013,2012.75, 2012.5  ,2012.25  ,2011,2011.75)
    data$row = rownames(data)
    
    library(reshape2)
    data1 = melt(data, id.vars = "row", measure.vars = c("AT","CZ","DE"),
         value.name = "GDP", variable.name = "Country")
    data1
           row Country GDP
    1     2013      AT   1
    2  2012.75      AT   2
    3   2012.5      AT   3
    4  2012.25      AT   4
    5     2011      AT   5
    6  2011.75      AT   6
    7     2013      CZ  11
    8  2012.75      CZ  12
    9   2012.5      CZ  13
    10 2012.25      CZ  14
    11    2011      CZ  15
    12 2011.75      CZ  16
    13    2013      DE  21
    14 2012.75      DE  22
    15  2012.5      DE  23
    16 2012.25      DE  24
    17    2011      DE  25
    18 2011.75      DE  26
    

    And if what you sought is a list of dataframes each for a Country then use dlply() :

    library(plyr)
    dlply(data1, .(Country), function(x) {rownames(x) = x$row;x$row = NULL;x})
    
    $AT
            Country GDP
    2013         AT   1
    2012.75      AT   2
    2012.5       AT   3
    2012.25      AT   4
    2011         AT   5
    2011.75      AT   6
    
    $CZ
            Country GDP
    2013         CZ  11
    2012.75      CZ  12
    2012.5       CZ  13
    2012.25      CZ  14
    2011         CZ  15
    2011.75      CZ  16
    
    $DE
            Country GDP
    2013         DE  21
    2012.75      DE  22
    2012.5       DE  23
    2012.25      DE  24
    2011         DE  25
    2011.75      DE  26