I am trying to build a workflow that collects stock data from a website (30 tickers, each with three prices), cleans it (adds a date column for the day the data was collected), appends it to a master tsibble that receives new data points every day, and then graphs the price ranges as individual plots compiled onto one page.
An example df for one day, to be appended to the master df that holds all the data:
df <- data.frame(ticker = c("XLU", "XLK", "XLF", "XLE", "XLP"),
buy_price = c(62.00, 68.00, 37.00, 55.00, 41.00),
sale_price = c(64.00, 71.00, 42.00, 60.00, 45.00),
close_price = c(63.00, 70.00, 38.00, 56.00, 43.00),
date = c("April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021"))
Second day of data:
df2 <- data.frame(ticker = c("XLU", "XLK", "XLF", "XLE", "XLP"),
buy_price = c(63.00, 69.00, 38.00, 53.00, 44.00),
sale_price = c(66.00, 77.00, 47.00, 63.00, 48.00),
close_price = c(65.00, 74.00, 39.00, 55.00, 45.00),
date = c("April 30th, 2021", "April 30th, 2021", "April 30th, 2021", "April 30th, 2021", "April 30th, 2021"))
DF Master file: rbind(df, df2)
ticker buy_price sale_price close_price date
1 XLU 62 64 63 April 29th, 2021
2 XLK 68 71 70 April 29th, 2021
3 XLF 37 42 38 April 29th, 2021
4 XLE 55 60 56 April 29th, 2021
5 XLP 41 45 43 April 29th, 2021
6 XLU 63 66 65 April 30th, 2021
7 XLK 69 77 74 April 30th, 2021
8 XLF 38 47 39 April 30th, 2021
9 XLE 53 63 55 April 30th, 2021
10 XLP 44 48 45 April 30th, 2021
I had used facet_wrap_paginate to facet by stock ticker name and create multiple graphs. However, a facet does not give me the fine control over the axes and individual plots that I need, so I have to plot each ticker individually and compile the plots onto the same pages. I had used the code below:
for(i in 1:4){
rr_plot <- ggplot(rr_tsibble, aes(x = DATE, color = TREND)) +
geom_point(aes(y = BUY.TRADE), size = 1.5) +
geom_point(aes(y = SELL.TRADE), size = 1.5) +
geom_point(aes(y = PREV.CLOSE), color = "black", size = 1, shape = 1) +
ggforce::facet_wrap_paginate(~TICKER,
nrow = 2,
ncol = 4,
scales = "free_y",
page = i) +
scale_y_continuous()
print(rr_plot)
}
The original data frame has ~30 individual tickers, with the same 30 added to the df the next day, and then 30 more the day after. I have tried using dplyr to group_by and plot, but I haven't been able to achieve the desired results. Creating 30 plots manually with ggplot2 is not very efficient; there must be a for loop that can select one ticker at a time, plot all of its data, and then use cowplot and gridExtra to compile all 30 generated plots. Any help or thoughts on how to accomplish this would be great! Thanks!
I generated some random data with 30 random tickers across 4 days:
r <- function() {abs(c(rnorm(29,50,2),100000)*rnorm(1,10,1))}
tickers = sapply(1:30, function(x) toupper(paste0(sample(letters, 3), collapse = "")))
df <- data.frame(ticker = tickers,
buy_price = r(),
sale_price = r(),
close_price = r(),
date = rep("April 29th, 2021",30))
df2 <- data.frame(ticker = tickers,
buy_price = r(),
sale_price = r(),
close_price = r(),
date = rep("April 30th, 2021",30))
df3 <- data.frame(ticker = tickers,
buy_price = r(),
sale_price = r(),
close_price = r(),
date = rep("May 1st, 2021",30))
df4 <- data.frame(ticker = tickers,
buy_price = r(),
sale_price = r(),
close_price = r(),
date = rep("May 2nd, 2021",30))
rr_tsibble <- rbind(df, df2, df3, df4)
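Convert date to Date format, stripping the ordinal suffix that follows the day number (matching the suffix only after digits avoids mangling month names such as "August"):
rr_tsibble$date = as.Date(gsub("(\\d+)(st|nd|rd|th)", "\\1", rr_tsibble$date), "%b %d, %Y")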
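Note that rr_tsibble is still a plain data frame at this point. If an actual tsibble is wanted, as the question suggests, the conversion could look roughly like the sketch below. It assumes the tsibble package is installed; the toy data frame master is made up for illustration and is not the data above.

```r
library(tsibble)

# Toy stand-in for the master data frame (hypothetical values)
master <- data.frame(
  ticker = rep(c("XLU", "XLK"), times = 2),
  close_price = c(63, 70, 65, 74),
  date = as.Date(c("2021-04-29", "2021-04-29", "2021-04-30", "2021-04-30"))
)

# key identifies each series (one per ticker), index is the time column
master_ts <- as_tsibble(master, key = ticker, index = date)
is_tsibble(master_ts)
```

As far as I know, as_tsibble() refuses rows that share the same key and index, so each ticker should appear only once per date. Randomly sampled tickers could collide, in which case the conversion would fail until the duplicates are removed.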
Add the addUnits() function for formatting the large numbers:
addUnits <- function(n) {
labels <- ifelse(n < 1000, n, # less than thousands
ifelse(n < 1e6, paste0(round(n/1e3,3), 'k'), # in thousands
ifelse(n < 1e9, paste0(round(n/1e6,3), 'M'), # in millions
ifelse(n < 1e12, paste0(round(n/1e9), 'B'), # in billions
ifelse(n < 1e15, paste0(round(n/1e12), 'T'), # in trillions
'too big!'
)))))}
Make the list of plots:
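library(ggplot2)
library(scales) # provides pretty_breaks()
plotlist <- list()
for (i in 1:ceiling(30/8)) # 30 tickers at 8 panels per page gives 4 pages
{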
plotlist[[i]] <- ggplot(rr_tsibble, aes(x = date)) +
geom_point(aes(y = buy_price), size = 1.5) +
geom_point(aes(y = sale_price), size = 1.5) +
geom_point(aes(y = close_price), color = "black", size = 1, shape = 1) +
scale_y_continuous(breaks = pretty_breaks(), labels = addUnits) +
ggforce::facet_wrap_paginate(~ticker,
nrow = 2,
ncol = 4,
scales = "free_y",
page = i)
}
There are 4 pages in total, each stored as an element of the plotlist list. For example, the final page is the 4th element, and looks like this:
plotlist[[4]]
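If facets still do not give enough control over individual panels, one alternative is to build a separate ggplot per ticker and arrange them with cowplot, which is closer to what the question asks for. The following is only a sketch under assumptions: the toy data frame, the helper name plot_one_ticker, and the 2 x 4 page layout are made up for illustration, and the ggplot2 and cowplot packages are assumed to be installed.

```r
library(ggplot2)
library(cowplot)

# Toy stand-in for the master data frame (values are made up for illustration)
set.seed(1)
tickers <- c("XLU", "XLK", "XLF", "XLE", "XLP")
dates <- as.Date("2021-04-29") + 0:3
toy <- expand.grid(ticker = tickers, date = dates, stringsAsFactors = FALSE)
toy$buy_price   <- runif(nrow(toy), 40, 60)
toy$sale_price  <- toy$buy_price + runif(nrow(toy), 1, 5)
toy$close_price <- toy$buy_price + runif(nrow(toy), 0, 3)

# Hypothetical helper: one independent ggplot for a single ticker's rows
plot_one_ticker <- function(d) {
  ggplot(d, aes(x = date)) +
    geom_point(aes(y = buy_price), size = 1.5) +
    geom_point(aes(y = sale_price), size = 1.5) +
    geom_point(aes(y = close_price), color = "black", size = 1, shape = 1) +
    labs(title = d$ticker[1], y = "price")
}

# split() gives one data frame per ticker; lapply() builds one plot per ticker
ticker_plots <- lapply(split(toy, toy$ticker), plot_one_ticker)

# Group the plots into pages of up to 8 (2 rows x 4 columns)
per_page <- 8
page_id <- ceiling(seq_along(ticker_plots) / per_page)
pages <- lapply(split(ticker_plots, page_id), function(p) {
  cowplot::plot_grid(plotlist = p, nrow = 2, ncol = 4)
})

# Show the first page
pages[[1]]
```

Because every panel is its own ggplot object, axis limits, breaks, or themes can be adjusted per ticker inside plot_one_ticker (or afterwards on ticker_plots[["XLU"]], for example) before the pages are assembled.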