Search code examples
sqlgoogle-bigqueryspatial-query

Using GPS coordinates to find Chicago cross streets


I am new to SQL and am using Bigquery. I have a table that has coordinates of longitude and latitude coordinates for bike stations in Chicago. I am trying to match these coordinates to cross streets. The coursera dataset that was provided to me has cross streets for many of the coordinates, but many of them have NULL for cross street names.

I have looked up my question and the closest thing I have found is that you can look to see if the gps point is within a certain geometrical area.

I just want to know if what I am looking to do is possible. If it is a little guidance on the next steps would be great.

This is the query I am starting with:

SELECT start_station_name, ST_GEOGPOINT(start_lng, start_lat) AS point FROM coursera-practice-356200.Cyclistic_data.Chi_metered WHERE start_station_name IS NULL

I was able to translate the gps locations as a point, but I am lost as what to do next.


Solution

  • First, you need another table with roads. One public dataset you can use is Open Street Maps, it is available in BigQuery public datasets project as bigquery-public-data.geo_openstreetmap.planet_ways.

    Second, you need to join your table with this table, on some "close-enough" condition - a station being within some number of meters close to the road. This is tricky - if you choose too large threshold, you get a lot of unrelated roads, if you choose too small, you miss the road you want. I chose 50m in the example below. The expression of two geometries being within some distance of each other is ST_DWithin(geo1, geo2, distance).

    I don't have access to your table, so I used bigquery-public-data.new_york_citibike.citibike_stations which has citibike data too.

    Here I build a list of roads within 50 m of each station:

    with stations as (
      SELECT name AS station_name, ST_GeogPoint(longitude, latitude) AS loc
      FROM `bigquery-public-data.new_york_citibike.citibike_stations`
    ), roads as (
      SELECT geometry, 
        (select value from w.all_tags where key = 'name') as road_name
          FROM `bigquery-public-data.geo_openstreetmap.planet_ways` w
      where geometry is not null
        and exists(select 1 from w.all_tags where key = 'highway')
    )
    select 
      station_name, 
      array_agg(distinct road_name) as nearby_roads
    from stations, roads
    where st_dwithin(loc, geometry, 50)
      and road_name is not null
    group by station_name
    

    The result seems like a good approximation, although there are some unrelated nearby streets as well:

    station_name                nearby_roads
    Delancey St & Eldridge St   "[Delancey Street,Eldridge Street]"
    Brook Ave & E 157 St        "[Brook Avenue,East 157th Street,East 158th Street]"
    Ryer Ave & E 182 St         "[Ryer Avenue,East 182nd Street]"
    Ditmars Blvd & 19 St        "[Ditmars Boulevard,19th Street]"
    ...