I want the overlapping part of a histogram to show the bar divided by a diagonal line. For example, I created the following code:
library(ggplot2)

set.seed(1)
grupo1 <- round(rnorm(100, mean = 20, sd = 2.2))
grupo2 <- round(rnorm(100, mean = 10, sd = 2))
df <- data.frame(
  valores = c(grupo1, grupo2),
  grupo = c(rep("grupo1", length(grupo1)), rep("grupo2", length(grupo2)))
)
# Create the histogram
ggplot(df, aes(x = valores, fill = grupo)) +
  geom_histogram(binwidth = 1, color = "black", position = "identity", alpha = 0.6) +
  labs(x = "Valores", y = "Frecuencia", fill = "Grupo") +
  scale_fill_manual(values = c("grupo1" = "blue", "grupo2" = "red")) +
  theme_minimal()
This code results in this plot (image: result). Note that the part that overlaps is a different color.
But I want the plot to be drawn like this (I modified it in Paint; image: what I want). Note that it shows both colors. Does anyone know how to draw the histogram as the last image shows?
A second example, following the same code:
df2 <- rbind(df, data.frame(valores = c(15,15), grupo = c("grupo1", "grupo1")))
This is the result (image: 2° example). But I want this (modified in Paint; image: what I want).
I have tried some 'geom_histogram' arguments, such as changing 'position', but that does not work. I am looking for code that solves my issue. Thanks in advance.
ggplot2 doesn't work well with this kind of thing. This question is about adding textures to bar plots, which is similar to what you want, and even that comes with great difficulty.
I've come up with some rather odd solutions; there might be a much easier approach that I don't know of.
Note: the dummy data (df2) is at the end.
This is basically what you said you didn't want, but with an improvement: the combined bar doesn't have the sum of the heights. To see why that matters, suppose we ran:
ggplot(df2, aes(valores, fill = grupo)) +
  geom_histogram(binwidth = 1, color = "black", alpha = 0.6)
The bar at valores = 15 will have count = 3 + 1, but we might prefer a count = 3 bar with another, smaller count = 1 bar below. We can get that using position_dodge(), but requesting no actual dodge:
ggplot(df2, aes(valores, fill = grupo)) +
  geom_histogram(position = position_dodge(0), binwidth = 1, color = "black", alpha = 0.6)
We can also add a small amount of dodge, for example with position_dodge(0.3).
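The small-dodge variant could look like the sketch below. It only swaps position_dodge(0) for position_dodge(0.3) in the previous call and reuses the dummy data from the end of this answer; the object name p is just for illustration.

```r
library(ggplot2)

# Dummy data, copied from the end of this answer
set.seed(1)
grupo1 <- round(rnorm(100, mean = 20, sd = 2.2))
grupo2 <- round(rnorm(100, mean = 10, sd = 2))
df <- data.frame(valores = c(grupo1, grupo2),
                 grupo = c(rep("grupo1", length(grupo1)), rep("grupo2", length(grupo2))))
df2 <- rbind(df, data.frame(valores = c(15, 15), grupo = c("grupo1", "grupo1")))

# Same histogram as before, but with a small horizontal offset between the groups
p <- ggplot(df2, aes(valores, fill = grupo)) +
  geom_histogram(position = position_dodge(0.3), binwidth = 1,
                 color = "black", alpha = 0.6)
p
```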
The problem with this solution is that it adds whitespace between the bars. There might be an option to geom_histogram that removes it, but I don't know of one. You can open a new question about that if you'd like.
geom_area + geom_segment
You can build a new dataset with the values of the histogram, which gives you more flexibility to customize the bars. You could do that with hist(), but since we're using ggplot, I made individual histograms for each group and got their data with ggplot_build(). There might be a better way to do that; the important part is that at the end you have a dataset with the histogram values for each group.
library(dplyr)   # %>%, group_split(), select(), mutate()
library(purrr)   # map_dfr()
library(tidyr)   # pivot_longer()
library(ggplot2)

df_area <- df2 %>%
  group_split(grupo) %>%                          # for each group
  map_dfr(function(df_group) {                    # apply the following function
    g <- ggplot(df_group, aes(valores)) +
      geom_histogram(binwidth = 1)                # build a histogram
    ggplot_build(g)$data[[1]] %>%                 # get its data
      select(c(x, xmin, xmax, y)) %>%             # select these columns
      mutate(grupo = unique(df_group$grupo)) %>%  # and add a 'grupo' column
      pivot_longer(c(xmin, xmax), values_to = "x_area")  # pivot the data in order to build columns with geom_area
  })
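For comparison, the hist() route mentioned above could be sketched as follows. This is not what the rest of the answer uses: the names breaks and hist_data are made up for illustration, and the shared breaks (centered on integers, width 1) are an assumption chosen to mimic binwidth = 1. Because both groups share the same breaks, this version also contains zero-count bins, which the ggplot_build() version does not.

```r
# Dummy data, copied from the end of this answer
set.seed(1)
grupo1 <- round(rnorm(100, mean = 20, sd = 2.2))
grupo2 <- round(rnorm(100, mean = 10, sd = 2))
df <- data.frame(valores = c(grupo1, grupo2),
                 grupo = c(rep("grupo1", length(grupo1)), rep("grupo2", length(grupo2))))
df2 <- rbind(df, data.frame(valores = c(15, 15), grupo = c("grupo1", "grupo1")))

binwidth <- 1
# Assumption: one set of breaks for both groups, with bins centered on the integers
breaks <- seq(min(df2$valores) - binwidth / 2,
              max(df2$valores) + binwidth / 2,
              by = binwidth)

# One histogram per group, computed without plotting
hist_data <- do.call(rbind, lapply(split(df2, df2$grupo), function(d) {
  h <- hist(d$valores, breaks = breaks, plot = FALSE)
  data.frame(grupo = d$grupo[1],
             x     = h$mids,               # bin centers
             xmin  = head(h$breaks, -1),   # left bin edges
             xmax  = tail(h$breaks, -1),   # right bin edges
             y     = h$counts)             # counts per bin
}))
```

The result has the same x, xmin, xmax, y and grupo columns as the ggplot_build() output before pivoting, so the same pivot_longer() step should apply to it.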
Now we can build the area of the histogram with geom_area, and the lines with geom_segment. Again we use position_dodge(0), but this time there is no whitespace:
ggplot(df_area, aes(x_area, y, fill = grupo)) +
  geom_area(position = position_dodge(0), alpha = 0.6, color = "black") +
  geom_segment(aes(y = 0, yend = y, x = x_area, xend = x_area))
This might yield some odd-looking boundaries between the area outline and the segments. Also, you can't use position_dodge(k) with k != 0.
geom_area + geom_segment with custom data
This is the closest to what you wanted. The idea is to change the data:
df_area2 <- df_area %>%
  mutate(y = case_when(grupo == "grupo1" ~ ifelse(name == "xmin", y, 0),
                       grupo == "grupo2" ~ ifelse(name == "xmax", y, 0)))
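This sets the height to zero at one edge of each bin (the right edge for grupo1, the left edge for grupo2), which produces the inclined bars.
Then we add that on top of the base graph from the previous section: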
ggplot(df_area, aes(x_area, y, fill = grupo)) +
  geom_area(position = position_dodge(0), color = "black") +
  geom_area(data = df_area2) + # on top of the base area, but below the lines
  geom_segment(aes(y = 0, yend = y, x = x_area, xend = x_area))
Using alpha here would make the inclined bars visible on top of the base areas. Instead, you can pass "washed out" colors to fill in order to match the tone you had before.
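One way to get such "washed out" colors is to blend each color with white by hand. The sketch below assumes the blue/red palette from the question, a white plot background, and the alpha = 0.6 used earlier; wash_out and fill_values are hypothetical names introduced only for this example.

```r
# Blend a color with white, mimicking what alpha = 0.6 looks like on a white background
wash_out <- function(col, alpha = 0.6) {
  rgb_col <- col2rgb(col) / 255                # 3 x n matrix with values in [0, 1]
  blended <- alpha * rgb_col + (1 - alpha)     # mix each channel with white (1)
  rgb(blended[1, ], blended[2, ], blended[3, ])
}

# Assumed palette: the colors from the question's scale_fill_manual() call
fill_values <- c(grupo1 = wash_out("blue"), grupo2 = wash_out("red"))
fill_values
```

These values could then be added to the plot with scale_fill_manual(values = fill_values), in place of alpha.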
Lastly, we can add lines at the end of the half bars:
binwidth <- 1
ggplot(df_area, aes(x_area, y, fill = grupo)) +
  geom_area(position = position_dodge(0), color = "black") +
  geom_area(data = df_area2) +
  geom_segment(aes(y = 0, yend = y, x = x_area, xend = x_area)) +
  geom_segment(aes(y = y, yend = y, x = x - 0.5 * binwidth, xend = x + 0.5 * binwidth))
Here, binwidth is the same one you used to create the histogram data.
Dummy data:
set.seed(1)
grupo1 <- round(rnorm(100, mean = 20, sd = 2.2))
grupo2 <- round(rnorm(100, mean = 10, sd = 2))
df <- data.frame(valores = c(grupo1, grupo2),
                 grupo = c(rep("grupo1", length(grupo1)), rep("grupo2", length(grupo2))))
df2 <- rbind(df, data.frame(valores = c(15, 15), grupo = c("grupo1", "grupo1")))