Search code examples
rfinanceyahoo-finance

Extracting stock prices with different dates and row numbers


I have 1200 stocks with different time periods all as a data frame and need to edit it slightly. Currently what I have looks like

Date        Ticker Price
30/04/2006  ZZ     15.95
31/05/2006  ZZ     13.13
30/06/2006  ZZ     13.27
31/07/2006  ZZ     13.66
31/08/2015  ZYNE   28.11
30/09/2015  ZYNE   14.31
31/10/2015  ZYNE   12.59
30/06/2014  ZSPH   28.75 
31/07/2014  ZSPH    28.01
31/08/2014  ZSPH    40.29 
30/09/2014  ZSPH    39.23
31/10/2014  ZSPH    37.62
30/11/2014  ZSPH    42.96
31/12/2014  ZSPH    41.57
31/01/2015  ZSPH    44.93
28/02/2015  ZSPH    49.47

i.e. there's different time periods used, with different numbers of rows, so it's not able to split every nth row into a new column

What I'd like in the end is a data frame something along the lines of

Date        ZZ     ZYNE     ZSPH
30/04/2006  15.95  N/A      N/A
31/05/2006  13.13  N/A      N/A
30/06/2006  13.27  N/A      N/A
31/07/2006  13.66  N/A      N/A
31/08/2006  N/A    N/A      N/A
30/09/2006  N/A    N/A      N/A
31/10/2006  N/A    N/A      N/A
30/11/2006  N/A    N/A      N/A

but with much longer time lengths and columns used as I have a load more to run from 2004 - 2018

From there I need to use the data on Fama French's website to calculate the expected return using the 4-factor model on all stocks. I've found YouTube videos on how to do it, but I've found I need the data to be in this format

Thanks!!

(edit: clarifying what I have is currently a df and needs to be edited slightly)


Solution

  • This creates a zoo object of the form shown in the question. It also works if text = Lines is replaced with a file name or 3 column input data frame with the columns shown in the question. Use fortify.zoo(z) if you need a data frame.

    library(zoo)
    z <- read.zoo(text = Lines, header = TRUE, split = "Ticker", format = "%d/%m/%Y")
    

    giving:

    > z
                ZSPH  ZYNE    ZZ
    2006-04-30    NA    NA 15.95
    2006-05-31    NA    NA 13.13
    2006-06-30    NA    NA 13.27
    2006-07-31    NA    NA 13.66
    2014-06-30 28.75    NA    NA
    2014-07-31 28.01    NA    NA
    2014-08-31 40.29    NA    NA
    2014-09-30 39.23    NA    NA
    2014-10-31 37.62    NA    NA
    2014-11-30 42.96    NA    NA
    2014-12-31 41.57    NA    NA
    2015-01-31 44.93    NA    NA
    2015-02-28 49.47    NA    NA
    2015-08-31    NA 28.11    NA
    2015-09-30    NA 14.31    NA
    2015-10-31    NA 12.59    NA
    

    Note

    The input in reproducible form is:

    Lines <- "
    Date        Ticker Price
    30/04/2006  ZZ     15.95
    31/05/2006  ZZ     13.13
    30/06/2006  ZZ     13.27
    31/07/2006  ZZ     13.66
    31/08/2015  ZYNE   28.11
    30/09/2015  ZYNE   14.31
    31/10/2015  ZYNE   12.59
    30/06/2014  ZSPH   28.75 
    31/07/2014  ZSPH    28.01
    31/08/2014  ZSPH    40.29 
    30/09/2014  ZSPH    39.23
    31/10/2014  ZSPH    37.62
    30/11/2014  ZSPH    42.96
    31/12/2014  ZSPH    41.57
    31/01/2015  ZSPH    44.93
    28/02/2015  ZSPH    49.47
    "