Given a data frame like this:
data.frame(status = c("open", "close", "close", "open/close", "close"),
           stock = c("google", "amazon", "amazon", "yahoo", "amazon"),
           newspaper = c("times", "newyork", "london", "times", "times"))
how do I need to transform the data to get an alluvial plot
where the two columns are stock and newspaper and the links represent the frequency of the status column?
With the data provided, it is hard to tell exactly what should be plotted. I suggest the following approach, inspired by this blog post on alluvial plots with the R package ggalluvial:
library(ggalluvial)
library(ggplot2)
library(dplyr)
df <- data.frame(status = c("open", "close", "close", "open/close", "close"),
                 stock = c("google", "amazon", "amazon", "yahoo", "amazon"),
                 newspaper = c("times", "newyork", "london", "times", "times"))
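# Count the number of occurrences of each alluvium (stock/newspaper/status combination)
df <- df %>%
  dplyr::group_by(stock, newspaper, status) %>%
  # .groups = "drop" returns an ungrouped tibble and silences the regrouping message
  dplyr::summarise(n = dplyr::n(), .groups = "drop")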
# Define the factors
df$status <- factor(df$status, levels = c("open", "open/close", "close"))
df$stock <- factor(df$stock, levels = c("google", "amazon", "yahoo"))
df$newspaper <- factor(df$newspaper, levels = c("times", "newyork", "london"))
# Plot the alluvial as in https://cran.r-project.org/web/packages/ggalluvial/vignettes/ggalluvial.html#alluvia-wide-format
ggplot2::ggplot(df, aes(y = n, axis1 = stock, axis2 = newspaper)) +
  ggalluvial::geom_alluvium(aes(fill = status), width = 1/12) +
  ggalluvial::geom_stratum(width = 1/12, fill = "black", color = "grey") +
  ggplot2::geom_label(stat = "stratum", aes(label = after_stat(stratum))) +
  ggplot2::scale_x_discrete(limits = c("stock", "newspaper"), expand = c(.05, .05)) +
  ggplot2::scale_fill_brewer(type = "qual", palette = "Set1") +
  ggplot2::ggtitle("Alluvial-Test")
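To see what the transformation actually produces, here is a minimal sketch of the counting step on its own. It assumes dplyr::count() as a shorthand for the group_by()/summarise() pair above, and uses a new name, counts, that is not part of the code above. With the sample data every stock/newspaper/status combination occurs exactly once, so each flow ends up with the same width.

```r
library(dplyr)
library(ggalluvial)

df <- data.frame(status = c("open", "close", "close", "open/close", "close"),
                 stock = c("google", "amazon", "amazon", "yahoo", "amazon"),
                 newspaper = c("times", "newyork", "london", "times", "times"))

# count() groups by the given columns, adds the frequency column n,
# and returns an ungrouped data frame
counts <- df %>%
  dplyr::count(stock, newspaper, status)

# One row per distinct combination; n becomes the width of the flow
print(counts)

# Check that the table is in the wide ("alluvia") format expected by
# axis1/axis2; columns 1 and 2 are stock and newspaper
ggalluvial::is_alluvia_form(counts, axes = 1:2, silent = TRUE)
```

If the real data contain repeated combinations, n will be larger than 1 for those rows and the corresponding flows will be drawn wider.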