Search code examples
rpanel

How to combine and then identify panel data with different data for each quarter?


I have 4 databases for each quarter per year and I would like to merge them to have all 4 quarters in one base. I think about creating an indicator for each quarter but I don't know how to do it. I do not have the quarter mentioned as a variable in the data but the 4 databases (4 quarters) follow the same individuals.

enter image description here


Solution

  • Supposed you have your data frames named "quarter.X", you could list those names using ls and put them into a new list "quarters", using mget in order to rbind them. To rbind multiple data frames we use do.call to call rbind on a list.

    Before rbinding we create the quarter indicator with the replacement function `[<-`; using Map we iterate over the quarters list and add a value= from the sequence 1:4 one by one. Since ls sorts the names alphabetically by default, this is safe if the data frames are names properly. Example:

    quarters <- mget(ls(pattern="quarter.\\d$"))
    
    res <- do.call(rbind, Map(`[<-`, quarters, "quarter", value=1:4))
    res
    #             id         x1         x2          x3 quarter
    # quarter.1.1  1  1.3709584  0.6328626  1.51152200       1
    # quarter.1.2  2 -0.5646982  0.4042683 -0.09465904       1
    # quarter.1.3  3  0.3631284 -0.1061245  2.01842371       1
    # quarter.2.1  1 -0.0627141 -1.3888607  0.63595040       2
    # quarter.2.2  2  1.3048697 -0.2787888 -0.28425292       2
    # quarter.2.3  3  2.2866454 -0.1333213 -2.65645542       2
    # quarter.3.1  1 -2.4404669 -1.7813084  1.89519346       3
    # quarter.3.2  2  1.3201133 -0.1719174 -0.43046913       3
    # quarter.3.3  3 -0.3066386  1.2146747 -0.25726938       3
    # quarter.4.1  1 -1.7631631  0.4554501 -0.60892638       4
    # quarter.4.2  2  0.4600974  0.7048373  0.50495512       4
    # quarter.4.3  3 -0.6399949  1.0351035 -1.71700868       4
    

    If the row names bother you, then use rownames(res) <- NULL afterwards to get consecutive numbers.


    Data:

    set.seed(42)
    quarter.1 <- data.frame(id=1:3, x1=rnorm(3), x2=rnorm(3), x3=rnorm(3))
    quarter.2 <- data.frame(id=1:3, x1=rnorm(3), x2=rnorm(3), x3=rnorm(3))
    quarter.3 <- data.frame(id=1:3, x1=rnorm(3), x2=rnorm(3), x3=rnorm(3))
    quarter.4 <- data.frame(id=1:3, x1=rnorm(3), x2=rnorm(3), x3=rnorm(3))