Could you please tell me how I can produce the graph shown? I want to select only the top 2 neighbourhoods in each city (ranked by median housing price) and plot their median prices. Ideally, the bars would also be in different colors. (Please note that I computed the median prices manually and plotted them in Excel, so they do not represent the real values.)
glimpse(CityNeighbourhoodPrice)
Observations: 37,245
Variables: 3
$ City <fct> Amsterdam, Amsterdam, Amsterdam...
$ Neighbourhood <fct> A,B,C,D,E,F,G,H,I,J,K...
$ Price <int> 970, 1320, 2060, 2480, 1070, 12...
Here is my code so far (that doesn't work):
CityNeighbourhoodPrice %>%
  group_by(Neighbourhood) %>%
  count(n) %>%
  top_n(2, MedPrice) %>%
  summarise(MedPrice = median(Price, na.rm = TRUE)) %>%
  ggplot(aes(x = reorder(Neighbourhood,-MedPrice), y = MedPrice)) +
  geom_col(fill = "tomato3", width = 0.5)+
  labs(title="Ordered Bar Chart",
       subtitle="Average Price by each Property Type",
       caption="Image: 5") +
  theme(axis.text.x = element_text(angle=65, vjust=0.6))
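The median has to be computed before you can pick the top 2, and the grouping has to include City so that the top 2 are taken within each city. Using some random example data, try this: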
# Example data
set.seed(42)
CityNeighbourhoodPrice <- data.frame(
  City = rep(c("Amsterdam", "Berlin", "Edinburgh"), each = 30),
  Neighbourhood = rep(sample(LETTERS[1:5], 30, replace = TRUE), 3),
  Price = 3000 * runif(3 * 30)
)
library(ggplot2)
library(dplyr)
library(forcats)
# Plot
CityNeighbourhoodPrice %>%
  # One median price per neighbourhood within each city
  group_by(City, Neighbourhood) %>%
  summarise(MedPrice = median(Price, na.rm = TRUE)) %>%
  # summarise() drops the last grouping level, so the data is still grouped
  # by City here and top_n() keeps the 2 highest medians per city
  top_n(2, MedPrice) %>%
  ungroup() %>%
  # Sort the rows, then freeze that order in a factor so the bars keep it
  arrange(City, MedPrice) %>%
  mutate(City_Neighbourhood = paste0(Neighbourhood, "\n", City),
         City_Neighbourhood = forcats::fct_inorder(City_Neighbourhood)) %>%
  ggplot(aes(x = City_Neighbourhood, y = MedPrice)) +
  geom_col(fill = "tomato3", width = 0.5)+
  labs(title="Ordered Bar Chart",
       subtitle="Average Price by each Property Type",
       caption="Image: 5") +
  theme(axis.text.x = element_text(angle=65, vjust=0.6))
Created on 2020-04-20 by the reprex package (v0.3.0)
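If you also want the bars in different colors, one option is to map fill to City instead of setting a fixed fill. The following is only a sketch on the same kind of random example data, not on your real prices. It assumes dplyr 1.0.0 or later, where slice_max() and the .groups argument of summarise() are available; the object names top2 and p and the plot labels are made up for illustration.

```r
library(ggplot2)
library(dplyr)
library(forcats)

# Random example data (not real prices)
set.seed(42)
CityNeighbourhoodPrice <- data.frame(
  City = rep(c("Amsterdam", "Berlin", "Edinburgh"), each = 30),
  Neighbourhood = rep(sample(LETTERS[1:5], 30, replace = TRUE), 3),
  Price = 3000 * runif(3 * 30)
)

# Top 2 neighbourhoods per city by median price
top2 <- CityNeighbourhoodPrice %>%
  group_by(City, Neighbourhood) %>%
  # keep the City grouping after summarising
  summarise(MedPrice = median(Price, na.rm = TRUE), .groups = "drop_last") %>%
  # with_ties = FALSE returns exactly 2 rows per city, even if medians tie
  slice_max(MedPrice, n = 2, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(City, MedPrice) %>%
  mutate(City_Neighbourhood = paste0(Neighbourhood, "\n", City),
         City_Neighbourhood = fct_inorder(City_Neighbourhood))

# Mapping fill to City inside aes() gives each city its own bar color
p <- ggplot(top2, aes(x = City_Neighbourhood, y = MedPrice, fill = City)) +
  geom_col(width = 0.5) +
  labs(title = "Top 2 neighbourhoods per city",
       x = NULL,
       y = "Median price")

print(p)
```

If you would rather have a separate color for every bar, fill = City_Neighbourhood should work in the same way, although the legend then mostly repeats the axis labels.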