I have just discovered the function facet_grid in ggplot2, and it's awesome. My question is this: I have a data set with 6 countries (column HC) and the destinations of flights all around the world. My data look like this:
HC Reason Destination freq Perc
<chr> <chr> <chr> <int> <dbl>
1 Germany Study Germany 9 0.3651116
2 Germany Work Germany 3 0.1488095
3 Germany Others Germany 3 0.4901961
4 Hungary Study Germany 105 21.4285714
5 Hungary Work Germany 118 17.6382661
6 Hungary Others Germany 24 5.0955414
7 Luxembourg Study Germany 362 31.5056571
Is there a way to show only the top ten destinations for each country while still using facet_grid? I'm trying to make a scatter plot like this:
Geograp %>%
  gather(key = Destination, value = freq, -Reason, -Qcountry) %>%
  rename(HC = Qcountry) %>%
  group_by(HC, Reason) %>%
  mutate(Perc = freq * 100 / sum(freq)) %>%
  ggplot(aes(x = Perc, y = reorder(Destination, Perc))) +
  geom_point(size = 3) +
  theme_bw() +
  facet_grid(HC ~ Reason) +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.major.y = element_line(colour = "grey60", linetype = "dashed"))
This produces a graph with every destination on the y-axis, and I want to avoid that overplotting on the y-axis. Thanks in advance!
You could create a variable holding the rank of each destination within its country and then, in the ggplot call, select only the rows with rank <= 10, e.g.
ggplot(data = mydata[mydata$rank <= 10, ], ....)
PS: Currently you create the data and plot it in a single pipe chain. I would separate the data creation and plotting steps.
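To make the idea concrete, here is a minimal sketch of the rank-then-filter approach with the data step kept apart from the plotting step. It is only one way to do it and rests on assumptions: the `flights` data frame is made-up toy data in the long format shown in the question, "top ten" is taken to mean the ten destinations with the highest total freq per country (summed over Reason), and the names `top_dest` and `plot_data` are purely illustrative. It also uses scales = "free_y" so that each country's row of panels lists only its own destinations.

```r
library(dplyr)
library(ggplot2)

# Made-up toy data in the long format from the question (HC, Reason, Destination, freq).
set.seed(1)
flights <- expand.grid(
  HC = c("Germany", "Hungary", "Luxembourg"),
  Reason = c("Study", "Work", "Others"),
  Destination = paste0("Dest", 1:15),
  stringsAsFactors = FALSE
)
flights$freq <- sample(1:200, nrow(flights), replace = TRUE)

# Data step 1: percentages within each country/reason, as in the question.
flights_perc <- flights %>%
  group_by(HC, Reason) %>%
  mutate(Perc = freq * 100 / sum(freq)) %>%
  ungroup()

# Data step 2: rank destinations within each country by total frequency
# (an assumption about what "top ten" means) and keep ranks 1 to 10.
# row_number() breaks ties arbitrarily, so exactly ten rows remain per country.
top_dest <- flights %>%
  group_by(HC, Destination) %>%
  summarise(total = sum(freq), .groups = "drop") %>%
  group_by(HC) %>%
  mutate(rank = row_number(desc(total))) %>%
  ungroup() %>%
  filter(rank <= 10)

# Data step 3: keep only the rows belonging to a top-ten destination of their country.
plot_data <- semi_join(flights_perc, top_dest, by = c("HC", "Destination"))

# Plot step: same plot as in the question, but with a free y scale per country.
p <- ggplot(plot_data, aes(x = Perc, y = reorder(Destination, Perc))) +
  geom_point(size = 3) +
  theme_bw() +
  facet_grid(HC ~ Reason, scales = "free_y") +
  theme(panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.major.y = element_line(colour = "grey60", linetype = "dashed"))
print(p)
```

Note that reorder() orders the destinations once over the whole data set, not separately per country, so the ordering inside a single facet row may not be strictly by Perc. If ties should all be kept instead of cut at exactly ten, min_rank() could replace row_number().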