
How to remove NA values from Yahoo Finance data


I would like to retrieve the closing prices of several instruments from Yahoo Finance. Unfortunately, the resulting series have different lengths, so combining them produces NA values. How can I remove the affected observations so that I can run a regression?

AMZN <- diff(log(tseries::get.hist.quote(instrument="AMZN", start= START_DATE,  end=END_DATE, quote=c( "Close"), provider= "yahoo", compression="d", retclass="zoo")))
nrow(AMZN) #250

SDAX <- diff(log(tseries::get.hist.quote(instrument="^SDAXI", start= START_DATE,  end=END_DATE, quote=c( "Close"), provider= "yahoo", compression="d", retclass="zoo")))
nrow(SDAX) #254


EURAUD <- diff(log(tseries::get.hist.quote(instrument="EURAUD=X", start= START_DATE,  end=END_DATE, quote=c( "Close"), provider= "yahoo", compression="d", retclass="zoo"))) 
nrow(EURAUD) #260

I then merge the individual series into a single object. Because the series have different lengths, the result contains NA values. The rows containing NA have to be removed, otherwise no regression analysis is possible.

zDataPreFX <- merge(SDAX, AMZN, EURAUD)

Solution

  • Combine the series with merge(), which aligns them by date and fills any date missing from a series with NA. Then call na.omit() to drop every row that contains an NA. See the code example below.

    start_date <- "2021-01-01"
    end_date <- "2021-12-31"
    
    AMZN <- diff(log(tseries::get.hist.quote(instrument = "AMZN", 
                                             start = start_date,  
                                             end = end_date, 
                                             quote = c("Close"), 
                                             provider = "yahoo", 
                                             compression ="d", 
                                             retclass="zoo")))
    
    SDAX <- diff(log(tseries::get.hist.quote(instrument = "^SDAXI", 
                                             start = start_date,  
                                             end = end_date, 
                                             quote=c("Close"),
                                             provider= "yahoo",
                                             compression="d",
                                             retclass="zoo")))
    
    EURAUD <- diff(log(tseries::get.hist.quote(instrument = "EURAUD=X",
                                               start = start_date,  
                                               end = end_date, 
                                               quote=c("Close"),
                                               provider= "yahoo", 
                                               compression="d",
                                               retclass="zoo"))) 
    
    all <- merge(AMZN, SDAX, EURAUD)
    head(all)
                 Close.AMZN   Close.SDAX  Close.EURAUD
    2021-01-04           NA           NA  0.0251187798
    2021-01-05  0.009954627  0.003818769  0.0056628563
    2021-01-06 -0.025211817  0.013925576 -0.0083800866
    2021-01-07  0.007548605  0.010638004 -0.0038076972
    2021-01-08  0.006474567 -0.002391856  0.0008108246
    2021-01-11 -0.021754382 -0.009921183 -0.0001139825
    
    
    all_cleaned <- na.omit(all)
    head(all_cleaned)
                 Close.AMZN   Close.SDAX  Close.EURAUD
    2021-01-05  0.009954627  0.003818769  0.0056628563
    2021-01-06 -0.025211817  0.013925576 -0.0083800866
    2021-01-07  0.007548605  0.010638004 -0.0038076972
    2021-01-08  0.006474567 -0.002391856  0.0008108246
    2021-01-11 -0.021754382 -0.009921183 -0.0001139825
    2021-01-12  0.002123521  0.009845704 -0.0008616212
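
The same steps can be tried without a network connection. The following is a minimal sketch on made-up data: the return values and the ten-day calendar are random placeholders, not Yahoo Finance data, and the regression formula (SDAX on AMZN and EURAUD) is only an assumption about what the regression might look like. It also shows merge(..., all = FALSE), which keeps only the dates common to all series and should give the same rows as merge() followed by na.omit() when the individual series contain no NA of their own.

```r
library(zoo)

# Made-up daily "returns" on slightly different calendars (placeholder data)
set.seed(1)
d <- as.Date("2021-01-04") + 0:9
AMZN   <- zoo(rnorm(8),  d[-c(1, 5)])  # two dates missing
SDAX   <- zoo(rnorm(9),  d[-1])        # one date missing
EURAUD <- zoo(rnorm(10), d)            # full calendar

# Default merge is an outer join: dates missing in a series become NA
merged <- merge(AMZN, SDAX, EURAUD)

# Drop every row that contains at least one NA
cleaned <- na.omit(merged)

# Alternative: inner join at merge time, keeping only the common dates
inner <- merge(AMZN, SDAX, EURAUD, all = FALSE)

# Fit a regression on the aligned rows (illustrative formula)
fit <- lm(SDAX ~ AMZN + EURAUD, data = as.data.frame(cleaned))
print(coef(fit))
```

Using na.omit() after the merge keeps the intermediate object with the NA rows available for inspection, while all = FALSE is shorter when only the overlapping dates are needed.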