I have a simple problem that I cannot solve:
I want to plot a data.frame (a month) with factors, where sometimes levels are missing. R then keeps only the levels that are actually present, so my plots differ depending on whether one, two, or more levels occur in the data.
Here is an example:
library(ggplot2)
library(reshape2)
f <- factor(c("Free", "Work"))
mon <- as.data.frame(matrix(as.factor(rep(f[2], times = 8)), nrow = 4))
colnames(mon) <- c("A", "B")
mt <- t(as.matrix(rev(data.frame(as.matrix(mon))))) # change order of y
m <- melt(mt)
col <- c("azure", "orange")
ggplot(m, aes(x = Var2, y = Var1, fill = value)) +
geom_tile(colour="grey10") +
scale_fill_manual(values = col, labels = f, name = NULL) +
theme(panel.background = element_rect(fill = "white"), axis.ticks = element_blank()) +
theme(axis.title.x = element_blank(), axis.title.y = element_blank())
As one can see, I assign the second of the two factor levels, "Work", to all elements, but the legend shows "Free". What is disturbing is that the factors in mon have only 1 level instead of the 2 possible levels.
The plot changes if I assign several levels to mon:
mon <- as.data.frame(matrix(as.factor(rep(c(f[1], f[2]), times = 4)), nrow = 4))
... and re-run the plot code above. It is also not possible to assign another level afterwards, even though it is one of the original 2 levels:
mon[1,1] <- f[1]
I tried a lot with levels, relevel, order, etc., without success. Does anyone have an idea?
Matrices can't hold factors. When you put a factor in a matrix, it gets coerced to character, and the unused levels are lost. as.data.frame(matrix(...)) is a bad habit for this reason (and because of other class conversions).
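The coercion can be checked directly in base R. The following is a minimal sketch (the names x, m and back are only for illustration) showing that a factor passed through matrix() comes back as character, so turning it into a factor again only sees the values that are present:

```r
f <- factor(c("Free", "Work"))
x <- rep(f[2], 4)

levels(x)   # the factor still records both levels: "Free" "Work"

# matrix() strips the factor class and stores the labels as character
m <- matrix(x, nrow = 2)
typeof(m)   # "character"

# Re-creating a factor from the matrix only sees the values that are present
back <- as.factor(as.vector(m))
levels(back)  # "Work"
```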
Here's a way to replicate your data transformations as near as I can follow them without losing factor levels:
f <- factor(c("Free", "Work"))
x <- rep(f[2], 4)
mon <- data.frame(A = x, B = x)
str(mon)
# 'data.frame': 4 obs. of 2 variables:
# $ A: Factor w/ 2 levels "Free","Work": 2 2 2 2
# $ B: Factor w/ 2 levels "Free","Work": 2 2 2 2
## looks good
# What is y? What's the point?
#mt <- t(as.matrix(rev(data.frame(as.matrix(mon))))) # change order of y
mon$id = 1:nrow(mon)
m <- reshape2::melt(mon, id.vars = "id", factorsAsStrings = FALSE)
levels(m$value)
# [1] "Free" "Work"
## looks good
Now, when we get to plotting, specify drop = FALSE in the scale to include unused levels in the legend. (Use the default drop = TRUE if you don't want the unused levels showing up.) Since the levels are already there, we don't need to customize the labels.
col <- c("azure", "orange")
ggplot(m, aes(x = id, y = variable, fill = value)) +
geom_tile(colour="grey10") +
scale_fill_manual(values = col, name = NULL, drop = FALSE) +
theme(panel.background = element_rect(fill = "white"), axis.ticks = element_blank()) +
theme(axis.title.x = element_blank(), axis.title.y = element_blank())
If you want to be extra safe with the color scale, you can add names to the values vector before putting it in the scale, so that colors are matched to levels by name rather than by position:
names(col) = levels(f)
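To illustrate the name matching, here is a small sketch with a deliberately reordered color vector. The data frame df is made up for this example, and ggplot_build() is used only to inspect which fill color the tiles end up with:

```r
library(ggplot2)

f <- factor(c("Free", "Work"))

# Named colors, intentionally NOT in level order
col <- c(Work = "orange", Free = "azure")

# Illustrative data: every tile has the level "Work"
df <- data.frame(id = 1:4, variable = "A", value = rep(f[2], 4))

p <- ggplot(df, aes(x = id, y = variable, fill = value)) +
  geom_tile(colour = "grey10") +
  scale_fill_manual(values = col, name = NULL, drop = FALSE)

# Inspect the fill colors computed for the tiles
fills <- ggplot_build(p)$data[[1]]$fill
unique(fills)
```

Because the vector is named, "Work" should get orange even though orange comes first in col.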
Another way to get the data would be to not worry about the levels during transformation, and re-factor with appropriate levels at the end:
# your original code:
f <- factor(c("Free", "Work"))
mon <- as.data.frame(matrix(as.factor(rep(f[2], times = 8)), nrow = 4))
colnames(mon) <- c("A", "B")
mt <- t(as.matrix(rev(data.frame(as.matrix(mon))))) # change order of y
m <- melt(mt)
# add this at the end
m$value = factor(m$value, levels = levels(f))
# check that it looks good:
str(m$value)
# Factor w/ 2 levels "Free","Work": 2 2 2 2 2 2 2 2
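One caveat with re-factoring at the end: factor() turns any value that is not listed in levels into NA without a warning. A small guard can catch that before plotting. It is sketched here in base R with made-up values (the vector value and its misspelled entry are assumptions for illustration):

```r
f <- factor(c("Free", "Work"))

# Hypothetical character values, e.g. after a matrix round trip;
# "work" is deliberately misspelled
value <- c("Work", "Free", "work")

# Values that are not valid levels would silently become NA
unknown <- setdiff(unique(value), levels(f))
unknown

refactored <- factor(value, levels = levels(f))
sum(is.na(refactored))  # number of values lost to NA
```

If unknown is not empty, it is probably worth fixing the data (or extending the levels) before calling factor().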