The data frame has three columns: id, Days, and sum. I would like to generate a heatmap of sum with id on the y axis and Days on the x axis. The problem is that the data is sparse, so the heatmap consists of discrete bars. I would like each bar to extend to the right so that the bars are solid: the color should change when sum changes and stay the same until the next Days value to the right.
Here's an example that generates the type of plot I'm making:
library(ggplot2)
set.seed(13)
x_id <- sample( LETTERS[1:5], 100, replace=TRUE,
prob=c(0.15, 0.2, 0.35, 0.1, 0.2) )
x_sum <- sample( c(5, 30, 60, 120, 180, 240, 360), 100, replace=TRUE,
prob=c(.1, .1, .2, .2, .2, .1, .1) )
x_days <- sample.int(2000, 100, replace = TRUE)-1000
df <- data.frame(id = x_id, Days = x_days, sum = x_sum)
ggp <- ggplot(data = df,
mapping = aes(x = Days,
y = id,
fill = sum)) +
geom_tile() +
xlab(label = "Days") + ylab(label = 'id') +
scale_fill_gradient(low = "blue", high = "red")
print(ggp)
I would like the colors to extend to the right. I believe this means the data frame should be sorted by id and Days, and rows should be added for each id to fill in the missing days, with sum set to the last observed value for that id. But how do I add the rows for each id and fill in the missing values? The rightmost color should also be extended by a fixed length, say 30 days, so that it is more visible.
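One possible way to get the same visual effect without adding rows is to give each tile a width that reaches the next observation for that id. The sketch below is only an illustration under stated assumptions: `toy`, `tail_len`, and the `end` column are hypothetical names, the data is made up, and duplicate (id, Days) pairs would produce zero-width tiles.

```r
library(data.table)
library(ggplot2)

# Hypothetical toy data, not the data from the question
toy <- data.table(
  id   = c("A", "A", "A", "B", "B"),
  Days = c(10, 0, 25, 5, 40),
  sum  = c(60, 30, 240, 180, 5)
)
tail_len <- 30  # assumed extension of the last bar, in days

# Sort by id and Days, then take the next Days value within each id
setorder(toy, id, Days)
toy[, end := shift(Days, type = "lead"), by = id]
# The last observation of each id has no successor: extend it by tail_len
toy[is.na(end), end := Days + tail_len]

# Draw each observation as one tile spanning [Days, end]
ggp_toy <- ggplot(toy, aes(x = (Days + end) / 2, y = id,
                           width = end - Days, fill = sum)) +
  geom_tile() +
  xlab("Days") + ylab("id")
print(ggp_toy)
```

Each row then becomes a single wide tile, so the table stays at its original size instead of growing to one row per day.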
Also, the color map should indicate a critical value. Say the critical value is 180. Then for sums from zero up to the critical value, the colors should go from green (0) to yellow (179), and for sums at or above the critical value, the colors should go from light red (180) to dark red (the maximum value, 360).
Update:
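Here is a solution for filling in the sparse matrix:
library(data.table)
library(tidyr)
DT <- as.data.table(df)  # the example data frame from above
setkey(DT, id, Days)
# One row per day for each id, from its first day to 30 days past its last
DT_fill_NA <- DT[setkey(DT[, .(min(Days):(max(Days)+30)), by = id], id, V1)]
# Carry the last observed sum forward into the added rows
DT_fill <- fill(DT_fill_NA, c('sum'), .direction = "down")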
ggp <- ggplot(data = DT_fill,
mapping = aes(x = Days,
y = id,
fill = sum)) +
geom_tile() +
xlab(label = "Days") + ylab(label = 'id') +
scale_fill_gradient(low = "blue", high = "red")
print(ggp)
This creates the figure with the sparse bars extended to the right, up to the next bar.
Now the color map should be modified to indicate a critical value. Let the critical value be 180. Then for sums from zero up to the critical value, the colors should go from green (0) to yellow (179), and for sums at or above the critical value, the colors should go from light red (180) to dark red (the maximum value, 360).
2nd Update
One way to generate a green-to-red scale with a break at 180 is as follows:
ggp <- ggplot(data = DT_fill,
mapping = aes(x = Days,
y = id,
fill = sum)) +
geom_tile() +
xlab(label = "Days") + ylab(label = 'id') +
scale_fill_gradient2(low = "green", mid = "indianred2", high = "red2",
midpoint = 180, breaks = c(50, 100, 200, 300)) +
theme_bw()
print(ggp)
I'm not sure this clearly identifies the break point at a specific value. How can the break between green and red be placed exactly at the critical value (180)?
Here is one approach to generate a filled heat map from a sparse matrix with a critical value highlighted.
library(ggplot2)
library(data.table)
library(tidyr)
set.seed(13)
n_rows = 200
x_id <- sample( LETTERS[1:5], n_rows, replace=TRUE,
prob=c(0.15, 0.2, 0.35, 0.1, 0.2) )
x_sum <- sample( c(0, 5, 30, 60, 120, 180, 240, 270, 360), n_rows, replace=TRUE,
prob=c(.05, .05, .1, .2, .2, .2, .1, .05, .05) )
x_days <- sample.int(2000, n_rows, replace = TRUE)-1000
DT <- data.table(id = x_id, Days = x_days, sum = x_sum)
setkey(DT, id, Days)
DT_fill_NA <- DT[setkey(DT[, .(min(Days):(max(Days)+100)), by = id], id, V1)]
DT_fill <- fill(DT_fill_NA, c('sum'), .direction = "down")
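# Bin sum into ordered classes; 180 is the critical value
# right = TRUE closes bins on the right, so sum == 180 lands in (100,180] (a green bin)
brks = c(-1, 50, 100, 180, 250, 300, max(DT_fill$sum))
DT_fill$sum_factors = cut(DT_fill$sum, breaks = brks, ordered_result = TRUE, right = TRUE)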
unique(DT_fill$sum_factors)
ggp <- ggplot(data = DT_fill,
mapping = aes(x = Days,
y = id,
fill = sum_factors)) +
geom_tile() +
xlab(label = "Days") + ylab(label = 'id') +
scale_fill_manual(values = c("green4", "green3", "green",
"firebrick1", "firebrick3", "firebrick4")) +
theme_bw()
print(ggp)
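If a continuous scale is preferred over the binned palette above, one possible sketch is to use `scale_fill_gradientn()` with two colour stops placed almost on top of each other at the critical value, so the gradient jumps from yellow to light red there. The demo data, the `1e-6` offset, the fixed upper limit of 360, and the names `demo`, `stop_values`, and `stop_colours` are assumptions made for illustration; `DT_fill` could be substituted for `demo`.

```r
library(ggplot2)
library(scales)

critical <- 180  # critical value from the question
max_sum  <- 360  # assumed upper limit of the colour scale

# Hypothetical demo data; DT_fill from the code above could be used instead
demo <- data.frame(
  id   = rep(c("A", "B"), each = 6),
  Days = rep(seq(0, 50, by = 10), times = 2),
  sum  = c(0, 60, 179, 180, 240, 360, 30, 120, 180, 270, 300, 5)
)

# Colour stops: green to yellow just below the critical value,
# then a jump to light red at the critical value, darkening to dark red
stop_values  <- c(0, critical - 1e-6, critical, max_sum)
stop_colours <- c("green", "yellow", "lightcoral", "darkred")

ggp_cont <- ggplot(demo, aes(x = Days, y = id, fill = sum)) +
  geom_tile() +
  scale_fill_gradientn(
    colours = stop_colours,
    # positions of the stops on the 0-1 scale used by the gradient
    values  = rescale(stop_values, from = c(0, max_sum)),
    limits  = c(0, max_sum),
    breaks  = c(0, 60, 120, critical, 240, 300, max_sum)
  ) +
  xlab("Days") + ylab("id") +
  theme_bw()
print(ggp_cont)
```

With this scale a sum of 180 is drawn in light red, whereas the binned version with `right = TRUE` puts 180 in the last green bin.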