Search code examples
rxtsquantmod

Adding Column To Each Quantmod Symbol


Learning R, not sure how to solve this.

library(quantmod)
library(xts)

# get market data
Nasdaq100_Symbols <- c("AAPL", "AAL")
getSymbols(Nasdaq100_Symbols)

# merge them together
nasdaq100 <- data.frame(as.xts(merge(AAPL, AAL)))
#tail(nasdaq100[,1:12],2)

#make percent difference column
nasdaq100$PD <- (((nasdaq100$AAPL.High - nasdaq100$AAPL.Open)/nasdaq100$AAPL.Open) * 100)

I'm trying to add a percent difference column, but the above code will only work for the AAPL symbol (or whichever symbol you use) and not create a PD column for each symbol.

Do you have to add that column somehow before merging with xts, or can I tell R to create it for each symbol in the new merged frame?

Edit: I am doing data training, so I need all symbols to be headers, like:

           AAPL.Ope AAPL.High AAPL.Volume AAL.Open AAL.High

1/3/2007    86.29   86.58      309579900    53.89   56.92
1/4/2007    84.05   85.95      211815100    56.3    59.15
1/5/2007    85.77   86.2       208685400    58.83   59.15

Solution

  • In my experience, it usually makes much more sense to keep your financial data as xts objects, for future manipulation with other technical indicators etc, unless you plan to run a prediction model in say caret in which case converting to a data.frame may make sense.

    Consider keeping your symbols of data as elements of a container, such as

    update_sym_md <- function(sym, env = .GlobalEnv) {
        x <- get(sym, env)
        pd <- setNames((Hi(x) - Op(x)) / Op(x), "PD")
        merge(x, pd)
    }
    
    # Adjust env for location of xts symbol data
    l.syms <- lapply(Nasdaq100_Symbols, update_sym_md, env = .GlobalEnv)
    
    lapply(l.syms, head)
    # [[1]]
    # AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted          PD
    # 2007-01-03     86.29     86.58    81.90      83.80   309579900      10.85709 0.003360760
    # 2007-01-04     84.05     85.95    83.82      85.66   211815100      11.09807 0.022605556
    # 2007-01-05     85.77     86.20    84.40      85.05   208685400      11.01904 0.005013373
    # 2007-01-08     85.96     86.53    85.28      85.47   199276700      11.07345 0.006630991
    # 2007-01-09     86.45     92.98    85.15      92.57   837324600      11.99333 0.075534942
    # 2007-01-10     94.75     97.80    93.45      97.00   738220000      12.56728 0.032190006
    # 
    # [[2]]
    # AAL.Open AAL.High AAL.Low AAL.Close AAL.Volume AAL.Adjusted           PD
    # 2007-01-03    53.89    56.92   53.89     56.30    2955600     54.80361 0.0562256273
    # 2007-01-04    56.30    59.15   53.65     58.84    2614500     57.27610 0.0506217238
    # 2007-01-05    58.83    59.15   57.90     58.29    1656300     56.74072 0.0054394015
    # 2007-01-08    57.30    60.48   57.04     57.93    2163200     56.39028 0.0554974006
    # 2007-01-09    59.44    60.20   57.56     57.90    2098600     56.36108 0.0127860366
    # 2007-01-10    60.03    60.04   57.34     58.93    3892200     57.36371 0.0001666167
    

    Also, if you do want to compare price returns/raw prices across symbols in one xts object, rather than in a data.frame, you might find the qmao package useful.

    For example:

    install.packages("qmao", repos="http://R-Forge.R-project.org", type = "source")
    library(qmao)
    
    pf <- makePriceFrame(Nasdaq100_Symbols)
    head(pf, 3)
    #               AAPL      AAL
    # 2007-01-03 10.85709 54.80361
    # 2007-01-04 11.09807 57.27610
    # 2007-01-05 11.01904 56.74072
    rf <- makeReturnFrame(Nasdaq100_Symbols)
    head(rf)
    
    #                  AAPL           AAL
    # 2007-01-03           NA            NA
    # 2007-01-04  0.021952895  0.0441273684
    # 2007-01-05 -0.007146715 -0.0093913155
    # 2007-01-08  0.004926208 -0.0061951917
    # 2007-01-09  0.079799692 -0.0005179716
    # 2007-01-10  0.046745798  0.0176329011
    

    Update in response to comment from OP:

    To join all the data into one row, try this:

    (Aside: if you're going to use non linear prediction models on this data.frame, make sure you consider scaling your data points across securities in each row first.)

    x.cbind <- do.call(cbind, l.syms)
    head(x.cbind)
    # AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted          PD AAL.Open AAL.High AAL.Low AAL.Close AAL.Volume AAL.Adjusted         PD.1
    # 2007-01-03     86.29     86.58    81.90      83.80   309579900      10.85709 0.003360760    53.89    56.92   53.89     56.30    2955600     54.80361 0.0562256273
    # 2007-01-04     84.05     85.95    83.82      85.66   211815100      11.09807 0.022605556    56.30    59.15   53.65     58.84    2614500     57.27610 0.0506217238
    # 2007-01-05     85.77     86.20    84.40      85.05   208685400      11.01904 0.005013373    58.83    59.15   57.90     58.29    1656300     56.74072 0.0054394015
    # 2007-01-08     85.96     86.53    85.28      85.47   199276700      11.07345 0.006630991    57.30    60.48   57.04     57.93    2163200     56.39028 0.0554974006
    # 2007-01-09     86.45     92.98    85.15      92.57   837324600      11.99333 0.075534942    59.44    60.20   57.56     57.90    2098600     56.36108 0.0127860366
    # 2007-01-10     94.75     97.80    93.45      97.00   738220000      12.56728 0.032190006    60.03    60.04   57.34     58.93    3892200     57.36371 0.0001666167
    
    df.cbind <- data.frame("time" = index(x.cbind), coredata(x.cbind))
    head(df.cbind)
    # time AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted          PD AAL.Open AAL.High AAL.Low AAL.Close AAL.Volume AAL.Adjusted         PD.1
    # 1 2007-01-03     86.29     86.58    81.90      83.80   309579900      10.85709 0.003360760    53.89    56.92   53.89     56.30    2955600     54.80361 0.0562256273
    # 2 2007-01-04     84.05     85.95    83.82      85.66   211815100      11.09807 0.022605556    56.30    59.15   53.65     58.84    2614500     57.27610 0.0506217238
    # 3 2007-01-05     85.77     86.20    84.40      85.05   208685400      11.01904 0.005013373    58.83    59.15   57.90     58.29    1656300     56.74072 0.0054394015
    # 4 2007-01-08     85.96     86.53    85.28      85.47   199276700      11.07345 0.006630991    57.30    60.48   57.04     57.93    2163200     56.39028 0.0554974006
    # 5 2007-01-09     86.45     92.98    85.15      92.57   837324600      11.99333 0.075534942    59.44    60.20   57.56     57.90    2098600     56.36108 0.0127860366
    # 6 2007-01-10     94.75     97.80    93.45      97.00   738220000      12.56728 0.032190006    60.03    60.04   57.34     58.93    3892200     57.36371 0.0001666167
    

    To better understand how qmao functions work, why not look at the examples in the documentation and go from there? ?makeReturnFrame Look at the source code to really understand what's going on (and become a better R programmer at the same time by learning good coding styles)