For my project, we created a for loop with if/else branches to assign a color to each of the five NYC boroughs using RColorBrewer. Here is the code for the loop, for reference. school.safety is my dataset.
color_vec <- vector(mode = "character", nrow(school.safety))
table(school.safety$Borough)
borough <- unique(school.safety$Borough)
k <- length(borough)
bor_colors <- brewer.pal(k, "Set1")
for (i in seq_len(nrow(school.safety))) {
  borough <- school.safety[, "Borough"]
  if (borough[i] == "K") {
    color_vec[i] <- bor_colors[1]
  } else if (borough[i] == "M") {
    color_vec[i] <- bor_colors[2]
  } else if (borough[i] == "Q") {
    color_vec[i] <- bor_colors[3]
  } else if (borough[i] == "R") {
    color_vec[i] <- bor_colors[4]
  } else if (borough[i] == "X") {
    color_vec[i] <- bor_colors[5]
  } else {
    color_vec[i] <- bor_colors[6]
  }
}
We are now using ggplot to create a barchart for the frequency of a particular incident by borough using the colors we assigned. Here is my code for the ggplot:
ggplot(school.safety, aes(school.safety$`Scanning Type`, fill=school.safety$Borough)) +
geom_bar(mapping = aes( color=color_vec, position="dodge", stat="identity")) +
scale_fill_manual(values=c("Brooklyn"="#377EB8" ,"Manhattan"="#4DAF4A","Queens"="#984EA3","Staten Island"="#E41A1C", "Bronx"="#FF7F00")) +
xlab("Scanning Type")+
ylab("Count")
Here is what our barchart looks like now:
How can we fill the bars with the borough colors assigned in the for loop and create a single legend for colors/boroughs? Additionally, does anyone know how to unstack the bar chart so that each scanning type gets five separate bars, one per borough?
Thanks so much
The color vector is not needed; we can do the mapping with a named vector in scale_fill_manual.
library(ggplot2)
library(RColorBrewer)
boroughs = unique(school.safety$Borough)
bor_colors = brewer.pal(length(boroughs), "Set1")
names(bor_colors) = boroughs
## now bor_colors is a named vector where the names are boroughs
## and the values are the colors
ggplot(school.safety, aes(x = `Scanning Type`, fill = Borough)) +
  ## all the aesthetics at the top is usually nice
  geom_bar(position = "dodge") +
  scale_fill_manual(values = bor_colors) +
  ## give our named vector to the values
  labs(x = "Scanning Type", y = "Count", fill = "Borough")
  ## labels all together is nice
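To see why the named vector is enough on its own, here is a minimal sketch on made-up data. The `toy` data frame and its values are invented for illustration and are not from the question's dataset. Indexing a named vector by name does the same per-row lookup as the for loop in a single step, and as far as I know scale_fill_manual matches named values to the fill levels in much the same way.

```r
library(RColorBrewer)

# Toy stand-in for school.safety; the values are invented for illustration.
toy <- data.frame(
  Borough = c("K", "M", "Q", "R", "X", "K", "X"),
  stringsAsFactors = FALSE
)

boroughs <- sort(unique(toy$Borough))
bor_colors <- brewer.pal(length(boroughs), "Set1")
names(bor_colors) <- boroughs

# Vectorized replacement for the for loop / if-else chain:
# index the named vector by each row's borough code.
color_vec <- unname(bor_colors[toy$Borough])

print(bor_colors)
print(color_vec)
```

Rows with the same borough code get the same color, and no explicit branch per borough is needed.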
You should use stat = "identity" in geom_bar when you already have a computed y value and are mapping a y aesthetic. You don't have y = in your aesthetic, so I'm pretty sure you don't want stat = "identity" (though that's just a guess since you haven't shared any sample data).
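For contrast, here is a small sketch of the case where stat = "identity" would be appropriate: the counts are already computed, one row per bar, and y is mapped. The `counts` data frame, its column names, and its numbers are hypothetical.

```r
library(ggplot2)

# Hypothetical pre-aggregated counts; the numbers are made up.
counts <- data.frame(
  scan_type = c("Full", "Full", "Partial", "Partial"),
  Borough   = c("K", "M", "K", "M"),
  n         = c(12, 7, 5, 9)
)

# One row per bar, so y is mapped and the heights are used as-is.
p_identity <- ggplot(counts, aes(x = scan_type, y = n, fill = Borough)) +
  geom_bar(stat = "identity", position = "dodge")

# geom_col() is shorthand for geom_bar(stat = "identity").
p_col <- ggplot(counts, aes(x = scan_type, y = n, fill = Borough)) +
  geom_col(position = "dodge")
```

With one row per incident, as the question's data appears to be, the default counting behavior of geom_bar is the better fit.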
If your data frame's Borough column has values K, M, Q, R, X instead of the real borough names, before running the above code I would create a new borough_name column with the names you want. One way to do that would be making a lookup table and joining:
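borough_lookup = data.frame(
  Borough = c("K", "M", "Q", "R", "X"),
  borough_name = c("Brooklyn", "Manhattan", "Queens", "Staten Island", "Bronx")
)
## the key column is named Borough to match school.safety,
## so merge() joins on it rather than building a cross join
school.safety = merge(school.safety, borough_lookup, by = "Borough")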
If needed, run this code to create the borough_name column and then use borough_name instead of Borough in all of the preceding code (both creating bor_colors and the plotting code).
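If you would rather not join (merge sorts the result by the key by default, so the row order can change), a named vector lookup is one alternative. This is only a sketch on made-up data: `toy` and `borough_names` are names I introduced for illustration.

```r
# Toy stand-in for school.safety; the values are invented for illustration.
toy <- data.frame(
  Borough = c("X", "K", "Q", "M", "R", "K"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup vector: names are the codes, values are the full names.
borough_names <- c(
  K = "Brooklyn", M = "Manhattan", Q = "Queens",
  R = "Staten Island", X = "Bronx"
)

# as.character() guards against Borough being a factor, where indexing
# would otherwise use the integer level codes instead of the labels.
toy$borough_name <- unname(borough_names[as.character(toy$Borough)])

# Any code missing from the lookup comes back as NA, so check for that.
stopifnot(!anyNA(toy$borough_name))
print(toy)
```

The rows stay in their original order, and the new borough_name column can be used in place of Borough in the plotting code.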