I'm trying to replicate approximately a map like this.
It depicts a small number of items (schools) spread across an area. For input I have the map of areas with a number for each of them. I would like to lay that out into a that many points around the area. It would be even better if they wouldn't diffuse across area boundaries, but simply distributing them is enough. Some nice repel points within area might work.
Beeswarm plots do something quite similar, could this be done on a map. Bonus question - in fact I've been looking to animate this, but can only think of very complicated ways to do this, so that new points are added as sum nrs increase in time.
The code below places the points in centroids on the map, and takes the number as a size. (I was unable to export the map properly as a single file, so coordinates are a bit messed up, but principle is the same.)
places = st_read("https://gist.githubusercontent.com/peeter-t2/9646a4169e993948fa97f6f503a0688b/raw/cb4e910bf153e51e3727dc9d1c73dd9ef86d2556/kih1897m.geojson", stringsAsFactors = FALSE)
schools <- read_tsv("https://gist.github.com/peeter-t2/34467636b3c1017e89f33284d7907b42/raw/6ea7dd6c005ef8577b36f5e84338afcb6c76b707/school_nums.tsv")
schools_geo <- merge(places,schools,by.x="KIHELKOND",by.y="Kihelkond") #94 matches
p<- schools_geo %>%
ggplot()+
geom_sf(data=schools_geo)+
geom_sf(data=st_centroid(schools_geo),aes(size=value))+
theme_bw()
p
Thanks!
As I noted in my comments, the when I read in the file it is setting the crs
to lat/lon (epsg: 4326
) while the geometry column is a different crs
. I have guessed that the correct crs
is espg: 3301
and proceeded on that basis which seems to work fine.
st_crs(schools_geo) <- 3301
We can use st_sample to get a sample of points within the polygons in relation to our 'value' column:
# we can set type = 'hexagonal', 'regular' or 'random'
school_pts <- schools_geo %>% st_sample(size = .$value, type = 'hexagonal')
schools_geo %>%
ggplot()+
geom_sf()+
geom_sf(data=school_pts, size = .8)+
theme_bw()
This produces the following plot which I think looks messy due to the fact st_sample
spreads the points out to the extents of the polygons.
It might look nicer to have the points more centered in each of the polygons like in the example you posted. To do that we could rescale the polygons depending on the number of points we want to plot within them. In the code below, I shrink the polygons by 90% if they have the least points inside (1) and by 20% if they have the most points (27).
# put values on scale between 0 and 1
scale_fact <- (max(schools_geo$value) - schools_geo$value) / (max(schools_geo$value) - min(schools_geo$value))
# re-scale between 0.2 and 0.9
scale_fact <- scale_fact * (0.9 - 0.2) + 0.2
# reverse the scale
scale_fact <- max(scale_fact) + min(scale_fact) - scale_fact
# apply the scale factor
schools_centroid <- st_geometry(st_centroid(schools_geo))
schools_geo_rescaled <- (st_geometry(schools_geo) - schools_centroid) * scale_fact + schools_centroid
school_pts <- schools_geo_rescaled %>%
st_sf(crs = 3301) %>%
bind_cols(value = schools_geo$value) %>%
st_sample(size = .$value, type = 'hexagonal')
# plot
schools_geo %>%
ggplot()+
geom_sf()+
geom_sf(data=school_pts, size = .8)+
theme_bw()