r, ggplot2, ggmap

Labelling coloured region in ggplot map


I am having a small challenge with labelling my ggplot map.

Here is how I plot my map in ggplot, which works well:

library(ggmap)
library(scales)

Kenya1_df <- as.data.frame(Kenya1_df)

theme_opts <- list(theme(panel.grid.minor = element_blank(),
                       panel.grid.major = element_blank(),
                       panel.background = element_blank(),
                       plot.background = element_blank(),
                       axis.line = element_blank(),
                       axis.text.x = element_blank(),
                       axis.text.y = element_blank(),
                       axis.ticks = element_blank(),
                       axis.title.x = element_blank(),
                       axis.title.y = element_blank(),
                       plot.title = element_blank()))

ggplot() + 
  geom_polygon(data = Kenya1_df, aes(x = long, y = lat, group = group, fill =
                                       count), color = "black", size = 0.25) +
  theme(aspect.ratio=1)+
  scale_fill_distiller(name="Count", palette = "Reds", breaks = pretty_breaks(n = 5))+
  #geom_text(aes(label =Kenya1_df$NAME_1, x = Kenya1_df$long, y = Kenya1_df$lat))+
  labs(title="Nice Map")

The output I get is good: [image]

Here is where my problem starts: the moment I add labels using geom_text(aes(label = Kenya1_df$NAME_1, x = Kenya1_df$long, y = Kenya1_df$lat)), I get a very messy map.

Here is how the code looks:

ggplot() + 
  geom_polygon(data = Kenya1_df, aes(x = long, y = lat, group = group, fill =
                                       count), color = "black", size = 0.25) +
  theme(aspect.ratio=1)+
  scale_fill_distiller(name="Count", palette = "Reds", breaks = pretty_breaks(n = 5))+
  geom_text(aes(label =Kenya1_df$NAME_1, x = Kenya1_df$long, y = Kenya1_df$lat))+
  labs(title="Nice Map")

I get this messy map: [image]

In the dataset, I have a column called count, which determines which regions should be coloured. I notice that some regions have NA in the count column, so I assume that all of these regions are being labelled as well.

Here is a snippet of the data: [image]

You can download the country map data from this link: map data. This article explains how to download it: article. With that downloaded, assume that only 5 counties have data to plot or colour.

My question is: how can I label only the rows that have numbers in the count column, or is there another smart way of labelling that does not hurt the visibility of the map? Note that not all regions have data; I want to label only those with data but still show a complete map.


Solution

  • The problem is that you are plotting a name at every vertex of every polygon in your map. You need to plot a single name at the centroid of each region. I find this easier with sf objects, and it's straightforward to convert your data to one (the code below assumes Kenya1_df is still the spatial object, as in "Data used", rather than a fortified data frame of vertices):

    library(ggplot2)
    
    # Convert the spatial object to an sf object
    Kenya1_df <- sf::st_as_sf(Kenya1_df)
    
    ggplot(Kenya1_df) + 
      geom_sf(aes(fill = count)) +
      # Label only regions that have a count, one label per region centroid
      geom_sf_label(data = sf::st_centroid(subset(Kenya1_df, !is.na(count))), 
                    aes(label = NAME_1)) +
      scale_fill_distiller(name = "Count", palette = "Reds", 
                           breaks = scales::pretty_breaks(n = 5))
    

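If you would rather keep the original geom_polygon() code, the same idea can be sketched without sf: collapse the vertex-level data frame to one row per region that has a count, and pass only that to geom_text(). This is a rough sketch on made-up toy data (toy_df and label_df are hypothetical names, not your data), and the mean of the vertices is only an approximation of the true centroid.

```r
# Toy stand-in for a fortified map data frame: one row per polygon vertex.
# Region names, coordinates and counts are made up for illustration.
toy_df <- data.frame(
  NAME_1 = rep(c("A", "B", "C"), each = 4),
  long   = c(0, 1, 1, 0,  1, 2, 2, 1,  2, 3, 3, 2),
  lat    = c(0, 0, 1, 1,  0, 0, 1, 1,  0, 0, 1, 1),
  group  = rep(c("A.1", "B.1", "C.1"), each = 4),
  count  = rep(c(150, NA, 500), each = 4)
)

# Keep only regions that have a count, then average the vertices of each
# region to get a single (approximate) label position per region.
with_data <- toy_df[!is.na(toy_df$count), ]
label_df  <- aggregate(cbind(long, lat) ~ NAME_1, data = with_data, FUN = mean)

nrow(toy_df)    # number of labels drawn when every vertex row is labelled
nrow(label_df)  # number of labels after collapsing to one row per region

# Build the plot only if ggplot2 is installed; call print(p) to draw it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot() +
    geom_polygon(data = toy_df,
                 aes(x = long, y = lat, group = group, fill = count),
                 color = "black") +
    geom_text(data = label_df, aes(x = long, y = lat, label = NAME_1))
}
```

All regions are still drawn because geom_polygon() gets the full data, while geom_text() gets only the collapsed rows. For oddly shaped regions the vertex mean can fall in an awkward place, which is one reason the sf route above is usually nicer.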


    Data used

    Obviously we don't have your count data, so here's how I recreated something similar to yours:

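    # Download level-1 administrative boundaries for Kenya from GADM
    # (returns a SpatialPolygonsDataFrame)
    Kenya1_df <- raster::getData("GADM", country = "KE", level = 1)
    
    # Start with no data for any region
    Kenya1_df$count <- NA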
    
    regions <- c("Kajiado", "Kitui", "Tana River", "Garissa", "Wajir", "Mandera",
                 "Homa Bay", "Meru", "Murang'a", "Nyeri", "Kiambu", "Nairobi", 
                 "Machakos", "Nakuru", "Elgeyo-Marakwet")
    
    values <- c(150, 150, 1000, 1900, 500, 150, 150, 
                150, 150, 150, 500, 750, 500, 150, 150)
    
    Kenya1_df$count[match(regions, Kenya1_df$NAME_1)] <- values
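
One thing to watch with the match() assignment above: match() returns NA for any name it cannot find, and a multi-value assignment with NA in the index fails. If your region names might not line up exactly with NAME_1, a check along these lines may help. It is only a sketch; map_names is a hypothetical stand-in for Kenya1_df$NAME_1, and the misspelling is deliberate.

```r
# Hypothetical stand-ins for Kenya1_df$NAME_1 and the lookup vectors above.
map_names <- c("Garissa", "Kitui", "Meru", "Nairobi", "Wajir")
regions   <- c("Kitui", "Nairobi", "Muranga")  # "Muranga" is deliberately misspelt
values    <- c(150, 750, 150)

# Position of each region in the map names; NA where there is no match.
idx <- match(regions, map_names)

# Report any names that could not be matched.
unmatched <- regions[is.na(idx)]
if (length(unmatched) > 0) {
  message("No match for: ", paste(unmatched, collapse = ", "))
}

# Assign only the matched entries, leaving every other region as NA.
count <- rep(NA_real_, length(map_names))
count[idx[!is.na(idx)]] <- values[!is.na(idx)]
count
```

Regions left as NA are still drawn on the map but are skipped by the !is.na(count) filter used for the labels.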