Tags: r, geospatial, shapefile, rgdal

Read only data slot from shapefile (R)?


I have some very large shapefiles. I can read them into SpatialPolygonsDataFrame objects using the rgdal function readOGR, but it takes a very long time for each file. I am actually only interested in the data.frame that shows up in the @data slot. Is there any way to read just the data, skipping the resource-intensive polygons?

Example code:

## State of Alabama census blocks (152 MB compressed, 266 MB uncompressed)
shpurl <- "http://www2.census.gov/geo/tiger/TIGER2011/TABBLOCK/tl_2011_01_tabblock.zip"
tmp    <- tempfile(fileext=".zip")
download.file(shpurl, destfile=tmp)
unzip(tmp, exdir=getwd())

## Read shapefile (readOGR() comes from the rgdal package)
library(rgdal)
nm  <- strsplit(basename(shpurl), "\\.")[[1]][1]
lyr <- readOGR(dsn=getwd(), layer=nm)

## Data I want
head(lyr@data)

Solution

  • Shapefiles are compound files that store their attribute data in a file with extension *.dbf. (See the Wikipedia shapefile article for a reference.) The dbf suffix refers to the dBase file format, which can be read by the function read.dbf() in the foreign package.

    So, try this:

    library(foreign)
    df <- read.dbf("tl_2011_01_tabblock.dbf")
    ## And, more generally, read.dbf("path/to/shapefile/shapefile-name.dbf")
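
If you usually start from the path to the .shp file, the same idea can be wrapped in a small helper. This is only a sketch: the function name read_shp_attributes() and the toy data are made up for illustration, and it assumes the .dbf sits next to the .shp with the same base name and a lower-case extension. Passing as.is = TRUE keeps identifier columns as character strings instead of converting them to factors, which is read.dbf()'s default.

```r
library(foreign)

## Hypothetical helper (not part of any package): given the path to a
## .shp file, read only the .dbf attribute table stored alongside it.
read_shp_attributes <- function(shp_path, as.is = TRUE) {
  ## Swap the extension: "path/to/name.shp" -> "path/to/name.dbf"
  dbf_path <- paste0(tools::file_path_sans_ext(shp_path), ".dbf")
  if (!file.exists(dbf_path)) {
    stop("No .dbf file found next to ", shp_path)
  }
  read.dbf(dbf_path, as.is = as.is)
}

## Toy demonstration: write a tiny attribute table with write.dbf() and
## read it back. No .shp is created; the helper never opens the geometry.
demo_dir <- tempfile("shp_demo")
dir.create(demo_dir)
toy <- data.frame(GEOID = c("010010201001000", "010010201001001"),
                  ALAND = c(1200, 3400),
                  stringsAsFactors = FALSE)
write.dbf(toy, file.path(demo_dir, "toy.dbf"))

attrs <- read_shp_attributes(file.path(demo_dir, "toy.shp"))
str(attrs)
```

For the file in the question, the call would be read_shp_attributes("tl_2011_01_tabblock.shp"), run from the directory where the archive was unzipped.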