I have a list of points that I want to overlay on a map of San Francisco using ggplot2. Each point is a longitude, latitude pair. I want the resulting map to be in a longitude/latitude coordinate system. I managed to reproduce Hadley Wickham's directions for plotting polygon shapefiles using his example file. I am using R 2.15.1 for Windows.
However, I tried to use cdp files downloaded from the UScensus2010cdp package. Here's my code snippet:
require("rgdal")
require("maptools")
require("ggplot2")
require("sp")
require("plyr")
gpclibPermit() # required for fortify method
require(UScensus2010)
require(UScensus2010cdp)
data(california.cdp10)
sf <- city(name = "san francisco", state="ca")
sf.points = fortify(sf)
I get the following error:
Using name to define regions.
Error in unionSpatialPolygons(cp, invert(polys)) : input lengths differ
In addition: Warning message:
In split(as.numeric(row.names(attr)), addNA(attr[, region], TRUE)) :
NAs introduced by coercion
Does anybody know:
UScensus2010cdp
, so am using UScensus2000cpd
which replicates the error.The issue arises from the fact that fortify.SpatialPolygonsDataFrame
relies on converting the row.names
to numeric, and the rownames of your data are the identifiers.
ggplot2:::fortify.SpatialPolygonsDataFrame
function (model, data, region = NULL, ...)
{
attr <- as.data.frame(model)
if (is.null(region)) {
region <- names(attr)[1]
message("Using ", region, " to define regions.")
}
polys <- split(as.numeric(row.names(attr)), addNA(attr[,
region], TRUE))
cp <- polygons(model)
try_require(c("gpclib", "maptools"))
unioned <- unionSpatialPolygons(cp, invert(polys))
coords <- fortify(unioned)
coords$order <- 1:nrow(coords)
coords
}
In your case
row.names(sf@data)
## [1] "california_586" "california_590" "california_616"
are the identifiers you wish to use as the region parameters, as place
state
and name
do not uniquely identify the three polygons.
# as.character used to coerce from factor
lapply(lapply(sf@data[,c('place','state','name')], unique), as.character)
## $place
## [1] "67000"
##
## $state
## [1] "06"
##
## $name
## [1] "San Francisco"
As a character vector where the elements begin with alphabetic characters, when coerced to numeric, it becomes NA
as.numeric(rownames(sf@data))
## [1] NA NA NA
## Warning message:
## NAs introduced by coercion
Which is one of the warnings given
NULL
or 1:nrow(sf@data)
So..
# rownames
sf@data[['place_id']] <- rownames(sf@data)
row.names(sf@data) <- NULL
# fortify
sf_ggplot <- fortify(sf, region = 'place_id')
# merge to add the original data
sf_ggplot_all <- merge(sf_ggplot, sf@data, by.x = 'id', by.y = 'place_id')
# very basic and uninteresting plot
ggplot(sf_ggplot_all,aes(x=long,y=lat, group = group)) +
geom_polygon(aes(fill =pop2000)) +
coord_map()