r, finance, yahoo, quantmod

How can I remove NAs from daily data downloaded from Yahoo Finance with the getSymbols function?


library(quantmod)
Underlyings <- c("AMZN", "ALV.DE", "BMW.DE")
getSymbols(Underlyings, from = "", to = "")

Should I now use an if statement or a for loop to eliminate the existing NAs?


Solution

  • You can use lapply to call getSymbols for each symbol, with na.omit wrapped around the call. By default, getSymbols places the downloaded object directly into your environment and returns only the symbol name, so na.omit has nothing to work on, yet you get no warning or error. With auto.assign = FALSE, getSymbols returns the data itself, so you can assign it yourself and pass the result straight to na.omit. You will still get the warning that a series such as SAF.PA contains missing values, but the NAs will have been removed from the objects in the list.
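
    A quick offline check of why the wrapping order matters, as a sketch rather than real quantmod output. With the default auto.assign = TRUE, getSymbols returns only the symbol name as a character string, so na.omit receives a string and has nothing to drop. The matrix below is a made-up stand-in for downloaded price data (an assumption for illustration; real getSymbols results are xts objects).

    ```r
    # With auto.assign = TRUE, getSymbols() returns just the ticker name,
    # so na.omit() sees a character string and passes it through unchanged.
    returned_name <- "AMZN"
    name_result <- na.omit(returned_name)

    # Hypothetical stand-in for the data returned with auto.assign = FALSE;
    # a plain matrix is used so the example runs without a download.
    prices <- matrix(
      c(100, 101,  NA, 103,
        200,  NA, 202, 203),
      ncol = 2,
      dimnames = list(NULL, c("Close", "Volume"))
    )
    clean <- na.omit(prices)

    print(name_result)   # still the ticker name
    print(nrow(prices))  # rows before cleaning
    print(nrow(clean))   # rows left after dropping every row that contains an NA
    ```

    The same call order applies to the real data: na.omit only helps when it is handed the returned object, not the name.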

    EDIT based on github script

    One of the stocks in the list (EI.PA) fails to download. I wrapped the call in try so that the error is caught and the loop continues with the next stock.

    library(quantmod)
    underlyings <- c("^STOXX50E", "ALV.DE", "G.MI", "BMW.DE", "SU.PA", "ENI.MI", "IBE.MC", "ORA.PA", "DBK.DE",
                 "BAYN.DE", "ENEL.MI", "AI.PA", "DTE.DE", "BN.PA", "SAF.PA", "BBVA.MC","PHIA.AS", 
                 "OR.PA", "ASML.AS", "DPW.DE", "AIR.PA", "BNP.PA", "INGA.AS", "ENGI.PA", "ABI.BR", 
                 "EI.PA", "SAN.PA", "CA.PA", "ITX.MC", "MC.PA", "FRE.DE")
    
    my_data <- lapply(underlyings, function(x) try(na.omit(getSymbols(x, from="2016-01-01", to="2019-01-08", auto.assign = FALSE))))
    names(my_data) <- underlyings
    
    Warning: EI.PA download failed; trying again.
    Error : EI.PA download failed after two attempts. Error message:
    HTTP error 404.
    In addition: Warning messages:
    1: ^STOXX50E contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them. 
    2: SU.PA contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them. 
    3: SAF.PA contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them. 
    4: ASML.AS contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them. 
    Warning message:
    SAN.PA contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them. 
    
    # show number of empty values
    sapply(my_data, function(x) sum(is.na(x)))
    
    ^STOXX50E    ALV.DE      G.MI    BMW.DE     SU.PA    ENI.MI    IBE.MC    ORA.PA    DBK.DE   BAYN.DE   ENEL.MI     AI.PA    DTE.DE 
            0         0         0         0         0         0         0         0         0         0         0         0         0 
        BN.PA    SAF.PA   BBVA.MC   PHIA.AS     OR.PA   ASML.AS    DPW.DE    AIR.PA    BNP.PA   INGA.AS   ENGI.PA    ABI.BR     EI.PA 
            0         0         0         0         0         0         0         0         0         0         0         0         0 
       SAN.PA     CA.PA    ITX.MC     MC.PA    FRE.DE 
            0         0         0         0         0 
    

    To remove the errors from the list:

    my_data[sapply(my_data, inherits, what = "try-error")] <- NULL
    
    # to create one big xts object:
    my_big_xts <- Reduce(cbind, my_data)
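
    The try-and-filter steps can be checked without a network connection. This is a minimal sketch, not the quantmod workflow itself: fetch_prices and the ticker names are hypothetical, and a data.frame stands in for the xts object that getSymbols would return.

    ```r
    # Hypothetical downloader: fails for one symbol, like a ticker that returns a 404.
    fetch_prices <- function(symbol) {
      if (symbol == "BAD.XX") stop(symbol, " download failed")
      data.frame(symbol = symbol, close = c(10, NA, 12))
    }

    symbols <- c("AAA.XX", "BAD.XX", "CCC.XX")  # made-up tickers

    # try() turns an error into a "try-error" object, so lapply() keeps going.
    results <- lapply(symbols, function(s) {
      try(na.omit(fetch_prices(s)), silent = TRUE)
    })
    names(results) <- symbols

    # Flag the failed downloads and keep only the successful ones.
    failed <- sapply(results, inherits, what = "try-error")
    ok <- results[!failed]

    print(names(ok))         # symbols that downloaded successfully
    print(sapply(ok, nrow))  # rows left per symbol after na.omit
    ```

    Indexing with the logical vector !failed also behaves sensibly when no download failed, since it then simply keeps every element.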
    

    But if you want to have multiple ticker symbols in a tidy data.frame you might want to look into the tidyquant package.