I have 1200 stocks with different time periods all as a data frame and need to edit it slightly. Currently what I have looks like
Date Ticker Price
30/04/2006 ZZ 15.95
31/05/2006 ZZ 13.13
30/06/2006 ZZ 13.27
31/07/2006 ZZ 13.66
31/08/2015 ZYNE 28.11
30/09/2015 ZYNE 14.31
31/10/2015 ZYNE 12.59
30/06/2014 ZSPH 28.75
31/07/2014 ZSPH 28.01
31/08/2014 ZSPH 40.29
30/09/2014 ZSPH 39.23
31/10/2014 ZSPH 37.62
30/11/2014 ZSPH 42.96
31/12/2014 ZSPH 41.57
31/01/2015 ZSPH 44.93
28/02/2015 ZSPH 49.47
i.e. there's different time periods used, with different numbers of rows, so it's not able to split every nth row into a new column
What I'd like in the end is a data frame something along the lines of
Date ZZ ZYNE ZSPH
30/04/2006 15.95 N/A N/A
31/05/2006 13.13 N/A N/A
30/06/2006 13.27 N/A N/A
31/07/2006 13.66 N/A N/A
31/08/2006 N/A N/A N/A
30/09/2006 N/A N/A N/A
31/10/2006 N/A N/A N/A
30/11/2006 N/A N/A N/A
but with much longer time lengths and columns used as I have a load more to run from 2004 - 2018
From there I need to use the data on Fama French's website to calculate the expected return using the 4-factor model on all stocks. I've found YouTube videos on how to do it, but I've found I need the data to be in this format
Thanks!!
(edit: clarifying what I have is currently a df and needs to be edited slightly)
This creates a zoo object of the form shown in the question. It also works if text = Lines
is replaced with a file name or 3 column input data frame with the columns shown in the question. Use fortify.zoo(z)
if you need a data frame.
library(zoo)
z <- read.zoo(text = Lines, header = TRUE, split = "Ticker", format = "%d/%m/%Y")
giving:
> z
ZSPH ZYNE ZZ
2006-04-30 NA NA 15.95
2006-05-31 NA NA 13.13
2006-06-30 NA NA 13.27
2006-07-31 NA NA 13.66
2014-06-30 28.75 NA NA
2014-07-31 28.01 NA NA
2014-08-31 40.29 NA NA
2014-09-30 39.23 NA NA
2014-10-31 37.62 NA NA
2014-11-30 42.96 NA NA
2014-12-31 41.57 NA NA
2015-01-31 44.93 NA NA
2015-02-28 49.47 NA NA
2015-08-31 NA 28.11 NA
2015-09-30 NA 14.31 NA
2015-10-31 NA 12.59 NA
The input in reproducible form is:
Lines <- "
Date Ticker Price
30/04/2006 ZZ 15.95
31/05/2006 ZZ 13.13
30/06/2006 ZZ 13.27
31/07/2006 ZZ 13.66
31/08/2015 ZYNE 28.11
30/09/2015 ZYNE 14.31
31/10/2015 ZYNE 12.59
30/06/2014 ZSPH 28.75
31/07/2014 ZSPH 28.01
31/08/2014 ZSPH 40.29
30/09/2014 ZSPH 39.23
31/10/2014 ZSPH 37.62
30/11/2014 ZSPH 42.96
31/12/2014 ZSPH 41.57
31/01/2015 ZSPH 44.93
28/02/2015 ZSPH 49.47
"