Search code examples
rpivotdataformat

Can you pivot longer within groups then append them back together to create a panel dataset in R?


I have some data that looks like this:

structure(list(...1 = c(NA, "F00001092B - Return", NA, "F00001092A - Return", 
NA, NA, NA, "F000015112 - Return", NA, "0P0001KFG1 - Return", 
NA, "0P0001KFG2 - Return", NA, "0P0001KG31 - Return"), ...2 = c("43220", 
"7.1993099999999997", "43220", "9.9567099999999993", "-N/A", 
NA, "44074", "5.0628000000000002", "44104", "-7.2036899999999999", 
"44104", "-7.71122", "44074", "3.45444"), ...3 = c(43251, 2.50914, 
43251, 7.08661, NA, NA, 44104, -9.33194, 44135, -1.46045, 44135, 
-1.33158, 44104, -6.43119), ...4 = c(43281, 10.58262, 43281, 
9.74265, NA, NA, 44135, -2.18682, 44165, 19.35054, 44165, 19.1103, 
44135, -2.34933), ...5 = c(43312, -2.08777, 43312, -3.51759, 
NA, NA, 44165, 21.84286, 44196, -2.95694, 44196, -3.05413, 44165, 
21.73591), ...6 = c(43343, 15.81356, 43343, 19.44444, NA, NA, 
44196, -0.38177, 44227, 7.80132, 44227, 7.27166, 44196, -5.5202
), ...7 = c(43373, -3.663, 43373, -4.36047, NA, NA, 44227, 7.00191, 
44255, 5.01423, 44255, 8.44608, 44227, 6.63778), ...8 = c(43404, 
-1.46088, 43404, -3.19149, NA, NA, 44255, 4.64376, 44286, 11.27339, 
44286, 8.3409, 44255, 3.08164), ...9 = c(43434, -7.53453, 43434, 
-9.89011, NA, NA, 44286, 7.97628, 44316, 3.1574, 44316, 3.81616, 
44286, 9.15585), ...10 = c(43465, -3.08588, 43465, -4.00697, 
NA, NA, 44316, 6.94333, 44347, 4.03152, 44347, 2.85649, 44316, 
6.65098), ...11 = c(43496, -1.09915, 43496, -0.54446, NA, NA, 
44347, 5.04557, 44377, 6.4516, 44377, 6.63552, 44347, 2.31668
), ...12 = c(43524, 10.30018, 43524, 12.22628, NA, NA, 44377, 
2.83606, 44408, -3.87638, 44408, -3.88994, 44377, 5.59935), ...13 = c(43555, 
4.50815, 43555, 7.47968, NA, NA, 44408, -3.8261, 44439, -1.63488, 
44439, -0.88845, 44408, -4.49692)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14"))

Snippet of the Data Frame This data is essentially organized into groups of two rows. The top row in each group denotes date (looks like this because it was imported from excel), and the bottom row denotes the returns for a firm on the associated date.

I need to place this data into panel format where I have data that looks like the follwing:

Date  Firm  Return
10/2   X    2.04
10/3   X    2.07
10/2   Y    3.4
10/3   Y    4.2

Within groups of two, the data is wide form. Is it possible to pivot the data longer within every two rows then append this back together?

I have tried something like this to no avail:

#Create Row Identifiers to Group the Data
rets$two_days <- c(0, rep(1:(nrow(rets)-1)%/%2))

#Group by every two days and pivot wider
rets%>%group_by(two_days)%>%
  pivot_longer(cols = !1,names_to = "month", values_to = "ret")

Really stumped on how to tackle this problem.


Solution

  • Here's an approach where we start off assigning each row to a "type" of either Return or Date, and a "group" relating to its eventual output row. Then we can reshape long and then wide by "type" to get the shape we want.

    library(tidyverse)
    df %>%
      mutate(type = if_else(row_number() %% 2 == 0, "Return", "Date"),
             group = (row_number() + 1) %/% 2, .before = 0) %>%
      mutate(across(4:last_col(), as.numeric)) %>%
      fill(`...1`, .direction = "up") %>%
      pivot_longer(4:last_col()) %>%
      pivot_wider(names_from = type, values_from = value) %>%
      mutate(Date = janitor::excel_numeric_to_date(Date))
    

    Result

    # A tibble: 84 × 5
       group ...1                name  Date       Return
       <dbl> <chr>               <chr> <date>      <dbl>
     1     1 F00001092B - Return ...2  2018-04-30   7.20
     2     1 F00001092B - Return ...3  2018-05-31   2.51
     3     1 F00001092B - Return ...4  2018-06-30  10.6 
     4     1 F00001092B - Return ...5  2018-07-31  -2.09
     5     1 F00001092B - Return ...6  2018-08-31  15.8 
     6     1 F00001092B - Return ...7  2018-09-30  -3.66
     7     1 F00001092B - Return ...8  2018-10-31  -1.46
     8     1 F00001092B - Return ...9  2018-11-30  -7.53
     9     1 F00001092B - Return ...10 2018-12-31  -3.09
    10     1 F00001092B - Return ...11 2019-01-31  -1.10