Search code examples
rsplittranspose

Split and transpose large column (4137 rows) into several rows of the same length the row (196 rows, 21 columns) in R


I have a problem with R.

I have created the following two tables.

Sheet 1: "long_strategic_return_event_window"

Date Relative_day daily_return_acquirer CUSIPID
2011-03-14 -10 0.00000000 00187E203280
2011-03-15 -9 0.00000000 00187E203280
2011-03-16 -8 0.02400000 00187E203280
2011-03-17 -7 -0.02343750 00187E203280
.... .... .... ....
.... .... .... ....
2011-04-07 8 -0.03488372 00187E203280
2011-04-08 9 -0.04953146 00187E203280
2011-04-11 10 -0.00422535 00187E203280
2016-01-15 -10 -0.01362530 0028241001499
2016-01-19 -9 -0.00271337 0028241001499
2016-01-20 -8 -0.01162503 0028241001499
2016-01-21 -7 -0.01201201 0028241001499
.... .... .... ....
.... .... .... ....

Sheet 2: "strategic_realizedreturn"

ROW NAME cusip_acquirer RR(-10) RR(-9) RR(-8) RR(-7) RR(-6) RR(-5) RR(-4) RR(-3) RR(-2) RR(-1) RR(0) RR(1) RR(2) RR(3) RR(4) RR(5) RR(6) RR(7) RR(8) RR(9) RR(10)
00187E203280 00187E203 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
0028241001499 002824100 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
00287Y1092353 00287Y109 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
00430U10332 00430U103 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
004498101906 004498101 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

I am desperately trying to take the column of sheet1$daily_return_acquirer and transpose it in the correct row of sheet2 if the CUSIPID of Sheet 1 is equal to the row name of sheet 2.

Every CUSIPID on Sheet 1 has exactly 21 values of daily_return_acquirer, which fit into the 21 columns of sheet 2. Sheet1$Relative_day should be equal to the rows on sheet 2 so: -10 = RR(-10), -9 = RR(-9), ..., 10=RR(10)

I have created a minimal reproducible example for both sheets to make it more understandable. I hope you can help me!

Sheet 1: "long_strategic_return_event_window"

structure(list(Date = structure(c(15047, 15048, 15049, 15050, 
15051, 15054, 15055, 15056, 15057, 15058, 15061, 15062, 15063, 
15064, 15065, 15068, 15069, 15070, 15071, 15072, 15075, 16815, 
16819, 16820, 16821, 16822, 16825, 16826, 16827, 16828, 16829, 
16832, 16833, 16834, 16835, 16836, 16839, 16840, 16841, 16842, 
16843, 16847, 18057, 18058, 18059, 18060, 18061, 18064, 18065, 
18066, 18067, 18068, 18071, 18072, 18073, 18074, 18075, 18078, 
18079, 18080, 18082, 18085, 18086, 14687, 14690, 14691, 14692, 
14693, 14694, 14697, 14698, 14699, 14700, 14704, 14705, 14706, 
14707, 14708, 14711, 14712, 14713, 14714, 14715, 14718, 15957, 
15958, 15959, 15960, 15961, 15964, 15965, 15966, 15967, 15968, 
15971, 15972, 15973, 15974, 15975, 15978, 15979, 15980, 15981, 
15982, 15985), class = "Date"), relative_day = c(-10L, -9L, -8L, 
-7L, -6L, -5L, -4L, -3L, -2L, -1L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, -10L, -9L, -8L, -7L, -6L, -5L, -4L, -3L, -2L, 
-1L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, -10L, -9L, 
-8L, -7L, -6L, -5L, -4L, -3L, -2L, -1L, 0L, 1L, 2L, 3L, 4L, 5L, 
6L, 7L, 8L, 9L, 10L, -10L, -9L, -8L, -7L, -6L, -5L, -4L, -3L, 
-2L, -1L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, -10L, 
-9L, -8L, -7L, -6L, -5L, -4L, -3L, -2L, -1L, 0L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L, 8L, 9L, 10L), daily_return_acquirer = c(0, 0, 
0.0240000000000001, -0.0234375000000001, -0.00799999999999997, 
0.0161290322580645, 0.0857142857142857, -0.00584795321637427, 
0.125, 0.0261437908496731, -0.0140127388535031, -0.0116279069767442, 
0.0457516339869281, 0.01875, -0.049079754601227, 0.0129032258064516, 
0, -0.0140127388535031, -0.0348837209302326, -0.0495314591700134, 
-0.00422535211267597, -0.0136253041362531, -0.00271336951159347, 
-0.0116250309176354, -0.0120120120120121, 0.0139311043566364, 
-0.00924306769922569, 0.0126071608673727, 0.00771912350597615, 
-0.0929083271559179, 0.0310542086624898, 0.0158520475561427, 
-0.0148244473342003, 0.0036958817317846, -0.000263019463440429, 
-0.015785319652723, 0.00133654103180979, -0.012279765082755, 
0.0056756756756757, -0.0233808116097822, 0.0217391304347826, 
0.0231618637220576, -0.00619914761720269, 0.0158544509421702, 
0.00511705257771513, 0.00483645157184688, -0.00329322355921476, 
-0.00050832380226194, -0.00508582326764152, -0.00817891373801918, 
0.0094060043808788, 0.00561654327291291, -0.00418888042650417, 
-0.162523900573614, 0.0350076103500761, 0.0294117647058824, 0.0388571428571428, 
0.00935093509350944, 0.0113079019073569, 0.00983429880102371, 
-0.0262806830309498, -0.0242498972461981, 0.000280819994383544, 
0.00761035007610347, -0.00755287009063441, 0.004566210045662, 
-0.0212121212121212, -0.00928792569659437, 0.0046874999999999, 
-0.0171073094867806, -0.0348101265822786, 0.00983606557377057, 
0.0551948051948052, 0.0784615384615384, 0.00713266761768899, 
-0.00141643059490082, 0.00425531914893621, 0.00423728813559326, 
-0.0295358649789029, 0.0159420289855072, 0.00427960057061345, 
0.0198863636363636, -0.0027855153203342, 0.00139664804469271, 
0.00590511097536142, 0.00485829959514174, 0.0394842868654311, 
-0.00523255813953494, 0.00331190337034876, -0.00349514563106796, 
0.0134450506625097, 0.00596039223226307, -0.000382262996941956, 
0.011281070745698, 0.00226885989790126, 0.00811167704206753, 
0.00243263473053897, 0.00746686578308752, 0.00222345747637585, 
-0.00055463117027179, 0.0177580466148724, 0.00308978553253353, 
-0.0108715346983148, -0.00696098186481045, -0.0084855192768862
), CusipID = c("00187E203280", "00187E203280", "00187E203280", 
"00187E203280", "00187E203280", "00187E203280", "00187E203280", 
"00187E203280", "00187E203280", "00187E203280", "00187E203280", 
"00187E203280", "00187E203280", "00187E203280", "00187E203280", 
"00187E203280", "00187E203280", "00187E203280", "00187E203280", 
"00187E203280", "00187E203280", "0028241001499", "0028241001499", 
"0028241001499", "0028241001499", "0028241001499", "0028241001499", 
"0028241001499", "0028241001499", "0028241001499", "0028241001499", 
"0028241001499", "0028241001499", "0028241001499", "0028241001499", 
"0028241001499", "0028241001499", "0028241001499", "0028241001499", 
"0028241001499", "0028241001499", "0028241001499", "00287Y1092353", 
"00287Y1092353", "00287Y1092353", "00287Y1092353", "00287Y1092353", 
"00287Y1092353", "00287Y1092353", "00287Y1092353", "00287Y1092353", 
"00287Y1092353", "00287Y1092353", "00287Y1092353", "00287Y1092353", 
"00287Y1092353", "00287Y1092353", "00287Y1092353", "00287Y1092353", 
"00287Y1092353", "00287Y1092353", "00287Y1092353", "00287Y1092353", 
"00430U10332", "00430U10332", "00430U10332", "00430U10332", "00430U10332", 
"00430U10332", "00430U10332", "00430U10332", "00430U10332", "00430U10332", 
"00430U10332", "00430U10332", "00430U10332", "00430U10332", "00430U10332", 
"00430U10332", "00430U10332", "00430U10332", "00430U10332", "00430U10332", 
"00430U10332", "004498101906", "004498101906", "004498101906", 
"004498101906", "004498101906", "004498101906", "004498101906", 
"004498101906", "004498101906", "004498101906", "004498101906", 
"004498101906", "004498101906", "004498101906", "004498101906", 
"004498101906", "004498101906", "004498101906", "004498101906", 
"004498101906", "004498101906")), row.names = c(NA, -105L), class = c("data.table", 
"data.frame"), .internal.selfref = <pointer: 0x0000022eca291ef0>)

Sheet 2: "strategic_realizedreturn"

structure(list(cusip_acquirer = structure(1:5, .Label = c("00187E203", 
"002824100", "00287Y109", "00430U103", "004498101"), class = "factor"), 
    `RR(-10)` = c("NA", "NA", "NA", "NA", "NA"), `RR(-9)` = c("NA", 
    "NA", "NA", "NA", "NA"), `RR(-8)` = c("NA", "NA", "NA", "NA", 
    "NA"), `RR(-7)` = c("NA", "NA", "NA", "NA", "NA"), `RR(-6)` = c("NA", 
    "NA", "NA", "NA", "NA"), `RR(-5)` = c("NA", "NA", "NA", "NA", 
    "NA"), `RR(-4)` = c("NA", "NA", "NA", "NA", "NA"), `RR(-3)` = c("NA", 
    "NA", "NA", "NA", "NA"), `RR(-2)` = c("NA", "NA", "NA", "NA", 
    "NA"), `RR(-1)` = c("NA", "NA", "NA", "NA", "NA"), `RR(0)` = c("NA", 
    "NA", "NA", "NA", "NA"), `RR(1)` = c("NA", "NA", "NA", "NA", 
    "NA"), `RR(2)` = c("NA", "NA", "NA", "NA", "NA"), `RR(3)` = c("NA", 
    "NA", "NA", "NA", "NA"), `RR(4)` = c("NA", "NA", "NA", "NA", 
    "NA"), `RR(5)` = c("NA", "NA", "NA", "NA", "NA"), `RR(6)` = c("NA", 
    "NA", "NA", "NA", "NA"), `RR(7)` = c("NA", "NA", "NA", "NA", 
    "NA"), `RR(8)` = c("NA", "NA", "NA", "NA", "NA"), `RR(9)` = c("NA", 
    "NA", "NA", "NA", "NA"), `RR(10)` = c("NA", "NA", "NA", "NA", 
    "NA")), row.names = c("00187E203280", "0028241001499", "00287Y1092353", 
"00430U10332", "004498101906"), class = "data.frame")

Solution

  • Unless I'm mistaken (and I might be because you haven't filled in any of the values in the shell of your desired output), this is simply a pivot_wider combined with the addition of row names from the output shell:

    long_strategic_return_event_window %>% 
      select(-Date) %>% 
      pivot_wider(
        names_from=relative_day, 
        names_prefix="RR", 
        values_from=daily_return_acquirer
      ) %>% 
      add_column(
        `ROW NAME`=rownames(strategic_realizedreturn), 
        .before=1
      )
    
    

    Giving

    # A tibble: 5 x 23
      `ROW NAME` CusipID  `RR-10`   `RR-9`   `RR-8`   `RR-7`   `RR-6`   `RR-5`   `RR-4`   `RR-3`   `RR-2`  `RR-1`      RR0      RR1      RR2      RR3      RR4      RR5     RR6      RR7
      <chr>      <chr>      <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>   <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>   <dbl>    <dbl>
    1 00187E203… 00187E…  0        0        0.024   -0.0234  -0.00800  1.61e-2  0.0857  -0.00585  1.25e-1 0.0261  -0.0140  -0.0116   0.0458   1.88e-2 -0.0491   1.29e-2  0      -0.0140 
    2 002824100… 002824… -0.0136  -0.00271 -0.0116  -0.0120   0.0139  -9.24e-3  0.0126   0.00772 -9.29e-2 0.0311   0.0159  -0.0148   0.00370 -2.63e-4 -0.0158   1.34e-3 -0.0123  0.00568
    3 00287Y109… 00287Y… -0.00620  0.0159   0.00512  0.00484 -0.00329 -5.08e-4 -0.00509 -0.00818  9.41e-3 0.00562 -0.00419 -0.163    0.0350   2.94e-2  0.0389   9.35e-3  0.0113  0.00983
    4 00430U103… 00430U…  0.00761 -0.00755  0.00457 -0.0212  -0.00929  4.69e-3 -0.0171  -0.0348   9.84e-3 0.0552   0.0785   0.00713 -0.00142  4.26e-3  0.00424 -2.95e-2  0.0159  0.00428
    5 004498101… 004498…  0.00591  0.00486  0.0395  -0.00523  0.00331 -3.50e-3  0.0134   0.00596 -3.82e-4 0.0113   0.00227  0.00811  0.00243  7.47e-3  0.00222 -5.55e-4  0.0178  0.00309
    # … with 3 more variables: RR8 <dbl>, RR9 <dbl>, RR10 <dbl>
    

    Though, personally, I'd choose slightly more user friendly-column names for R-10 etc...