I want to download stock data for different date ranges, normalize each series, align them all so that they start at time 0, and plot them in a single plot so that I can visually compare their performance over time.
I have the following script:
library(quantmod)
library(ggplot2)
Symbols=spl('AAPL,C')
dates_from=spl('2000-1-1,2008-1-1')
dates_to=spl('2000-12-1,2008-12-1')
noS = length(Symbols)
dataEnv<-new.env()
for(i in 1:noS) getSymbols(Symbols[[i]], src = "yahoo", from=dates_from[[i]], to=dates_to[[i]],env=dataEnv, auto.assign = T, use.Adjusted=T)
for(i in ls(dataEnv)) dataEnv[[i]] = Ad(dataEnv[[i]])#Use adjusted data
# Normalize the series to start from the same point (time 0)
for (i in ls(dataEnv)) {
dataEnv[[i]] = dataEnv[[i]]/ coredata(dataEnv[[i]])[1]
}
for (i in ls(dataEnv)) {
dataEnv[[i]] = data.frame(Date = 1:length(dataEnv[[i]]), Price = coredata(dataEnv[[i]]))
}
# Ensure column names are correct and create a single plot for each symbol in dataEnv
for (symbol in ls(dataEnv)) {
df <- dataEnv[[symbol]]
names(df) <- c("Date", "Price") # Ensure correct column names
plot <- ggplot(df, aes(x = Date, y = Price)) +
geom_line() +
ggtitle(paste("Normalized Price Series for", symbol)) +
xlab("Time") +
ylab("Normalized Price") +
theme_minimal()
print(plot)
}
# I want to draw all the series into a single plot instead
I was able to plot multiple separate plots, but I faced problems combining the different series into a single plot. Various solutions I tried combined the dataframes with a lot of NAs because the series had different dates.
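The easiest way to plot all the individual series in a single plot is to combine them into a single data frame in long format, with one column identifying the symbol. I have reworked your entire workflow to streamline the process; this does not affect your existing plot code.
Note that your code does not run as posted (for example, spl() is not defined by quantmod or ggplot2, so I replaced it with strsplit()); this has been addressed below.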
library(quantmod)
library(ggplot2)
Symbols <- strsplit("AAPL,C", ",")[[1]]
dates_from <- as.Date(strsplit("2000-1-1,2008-1-1", ",")[[1]])
dates_to <- as.Date(strsplit("2000-12-1,2008-12-1", ",")[[1]])
noS <- length(Symbols)
dataEnv <- new.env()
# Download and process data
for (i in 1:noS) {
  getSymbols(Symbols[[i]],
             src = "yahoo",
             from = dates_from[[i]],
             to = dates_to[[i]],
             env = dataEnv,
             auto.assign = TRUE,
             use.Adjusted = TRUE)
  # Keep only the adjusted close and give it a common column name
  dataEnv[[Symbols[i]]] <- Ad(dataEnv[[Symbols[i]]])
  colnames(dataEnv[[Symbols[i]]]) <- "Price"
}
# Normalize each series to start at 1, then convert it to a data frame
# with a trading-day index (instead of the calendar date) and a Symbol column
for (symbol in ls(dataEnv)) {
  x <- dataEnv[[symbol]] / coredata(dataEnv[[symbol]])[1]
  dataEnv[[symbol]] <- data.frame(Date = seq_len(nrow(x)),
                                  Price = as.numeric(coredata(x)),
                                  Symbol = symbol)
}
# Collect the data frames stored in dataEnv (they are no longer xts objects)
df_list <- mget(ls(dataEnv), envir = dataEnv)
# Stack all data frames into one long data frame
df <- do.call(rbind, df_list)
# Plot all series in single plot, use Symbol column to group and colour each series
ggplot(df, aes(x = Date, y = Price, group = Symbol, colour = Symbol)) +
  geom_line() +
  ggtitle("Normalized Price Series") +
  xlab("Time") +
  ylab("Normalized Price") +
  theme_minimal()
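If you want to see the idea without downloading anything, here is a minimal sketch on made-up data. The `series` list, its prices, and the names `wide_df`, `long_df` and `Day` are all hypothetical and only there for illustration. It shows why merging on the calendar date fills the result with NAs when the date ranges do not overlap, and how stacking the series in long format with a row index avoids that.

```r
library(ggplot2)

# Hypothetical toy data: two short series covering different calendar dates
series <- list(
  A = data.frame(Date = as.Date("2000-01-03") + 0:3, Price = c(10, 11, 12, 9)),
  B = data.frame(Date = as.Date("2008-01-02") + 0:2, Price = c(50, 45, 55))
)

# Merging on the calendar date gives one row per date and NAs everywhere
# the other series has no observation (here: no dates overlap at all)
wide_df <- merge(series$A, series$B, by = "Date", all = TRUE,
                 suffixes = c(".A", ".B"))

# Long format instead: normalize each series by its first value,
# replace the date with a row index, and tag each row with its symbol
long_list <- lapply(names(series), function(symbol) {
  s <- series[[symbol]]
  data.frame(Day = seq_len(nrow(s)),
             Price = s$Price / s$Price[1],
             Symbol = symbol)
})
long_df <- do.call(rbind, long_list)

# One line per symbol, all starting from the same point
p <- ggplot(long_df, aes(x = Day, y = Price, colour = Symbol)) +
  geom_line()
print(p)
```

Because every row carries its own Symbol, series of different lengths can share one data frame without any padding, which is the same layout the code above builds from dataEnv.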