Search code examples
rdictionarygisleafletgtfs

How to create an interactive plot of GTFS data in R using Leaflet?


I would like to create an interactive map showing the public transport lines of a city. I am trying to do this using Leaflet in R (but I'm open to alternatives, suggestions?)

Data: The data of the transport system is in GTFS format, organized in text files (.txt), which I read into R as a data frame.*

The Problem: I cannot find how to indicate the id of each Poly line (variable shape_id) so the plot would actually follow the route of each transit line. Instead, it is connecting the dots in a random sequence.

Here is what I've tried, so far without success:

# Download GTFS data of the Victoria Regional Transit System
  tf <- tempfile() 
  td <- tempdir()
  ftp.path <- "http://www.gtfs-data-exchange.com/agency/bc-transit-victoria-regional-transit-system/latest.zip"
  download.file(ftp.path, tf) 

# Read text file to a data frame
  zipfile <- unzip( tf , exdir = td )
  shape <- read.csv(zipfile[9])

# Create base map
  basemap <- leaflet() %>% addTiles()


# Add transit layer
  basemap  %>% addPolylines(lng=shape$shape_pt_lon, lat=shape$shape_pt_lat, 
                            fill = FALSE,
                            layerId =shape$shape_id) 

I would be glad to have your comments on this.

*I know it is possible to import this data into a GIS software (e.g. QGIS) to create a shapefile and then read the shapefile into R with readOGR. Robin Lovelace has shown how to do this. BUT, I am looking for a pure R solution. ;)

ps. Kyle Walker has written a great intro to interactive maps in R using Leaflet. Unfortunately, he doesn't cover poly lines in his tutorial.


Solution

  • Your problem is not one of method but of data: note that you download 8 MB and that the line file you try to load into Leaflet via shiny is 5 MB. As a general principle, you should always try new methods with tiny datasets first, before scaling them up. This is what I do below to diagnose the problem and solve it.

    Stage 1: Explore and subset the data

    pkgs <- c("leaflet", "shiny" # packages we'll use
      , "maps" # to test antiquated 'maps' data type
      , "maptools" # to convert 'maps' data type to Spatial* data
      )
    lapply(pkgs, "library", character.only = TRUE)
    
    
    class(shape)
    ## [1] "data.frame"
    
    head(shape)
    
    ##   shape_id shape_pt_lon shape_pt_lat shape_pt_sequence
    ## 1 1-39-220    -123.4194     48.49065                 0
    ## 2 1-39-220    -123.4195     48.49083                 1
    ## 3 1-39-220    -123.4195     48.49088                 2
    ## 4 1-39-220    -123.4196     48.49123                 3
    ## 5 1-39-220    -123.4197     48.49160                 4
    ## 6 1-39-220    -123.4196     48.49209                 5
    
    object.size(shape) / 1000000 # 5 MB!!!
    
    ## 5.538232 bytes
    
    summary(shape$shape_id)
    shape$shape_id <- as.character(shape$shape_id)
    ids <- unique(shape$shape_id)
    shape_orig <- shape
    shape <- shape[shape$shape_id == ids[1],] # subset the data
    

    Stage 2: Convert to a Spatial* object

    Is this like the data.frame objects from maps?

    state.map <- map("state", plot = FALSE, fill = TRUE)
    str(state.map)
    
    ## List of 4
    ##  $ x    : num [1:15599] -87.5 -87.5 -87.5 -87.5 -87.6 ...
    ##  $ y    : num [1:15599] 30.4 30.4 30.4 30.3 30.3 ...
    ##  $ range: num [1:4] -124.7 -67 25.1 49.4
    ##  $ names: chr [1:63] "alabama" "arizona" "arkansas" "california" ...
    ##  - attr(*, "class")= chr "map"
    

    Yes, it's similar, so we can use map2Spatial* to convert it:

    shape_map <- list(x = shape$shape_pt_lon, y = shape$shape_pt_lat)
    shape_lines <- map2SpatialLines(shape_map, IDs = ids[1])
    plot(shape_lines) # success - this plots a single line!
    

    line

    Stage 3: Join all the lines together

    A for loop will do this nicely. Note we only use the first 10 lines. Use 2:length(ids) for all lines:

    for(i in 2:10){
      shape <- shape_orig[shape_orig$shape_id == ids[i],]
      shape_map <- list(x = shape$shape_pt_lon, y = shape$shape_pt_lat)
      shape_temp <- map2SpatialLines(shape_map, IDs = ids[i])
      shape_lines <- spRbind(shape_lines, shape_temp)
    }
    

    Stage 4: Plot

    Using the SpatialLines object makes the code a little shorter - this will plot the first 10 lines in this case:

    leaflet() %>% 
      addTiles() %>%
      addPolylines(data = shape_lines)
    

    first-10

    Conclusion

    You needed to play around with the data and manipulate it before converting it into a Spatial* data type for plotting, with the correct IDs. maptools::map2Spatial*, unique() and a clever for loop can solve the problem.