I have a simple Neo4j database of roads: the nodes are cities with latitude, longitude and id properties, and the roads between them are GO relationships.
For example:
MATCH (city) RETURN (city) LIMIT 5
{"latitude":"41.974556","id":"0","longitude":"-121.904167"}
{"latitude":"41.974766","id":"1","longitude":"-121.902153"}
{"latitude":"41.988075","id":"2","longitude":"-121.896790"}
{"latitude":"41.998032","id":"3","longitude":"-121.889603"}
{"latitude":"42.008739","id":"4","longitude":"-121.886681"}
and
MATCH (n1)-[r]-(n2) RETURN "GO", n1.id, n2.id LIMIT 4
"GO" | "n1.id" | "n2.id" |
---|---|---|
"GO" | "0" | "1" |
"GO" | "0" | "3" |
"GO" | "1" | "0" |
"GO" | "1" | "2" |
With the following code I can plot the nodes on a map:
from py2neo import Graph
import pandas as pd
import geopandas
import matplotlib.pyplot as plt
port = "7687"
user = "****"
pswd = "*****"
try:
    graph = Graph('bolt://localhost:' + port, auth=(user, pswd))
    print('SUCCESS: Connected to the Neo4j Database.')
except Exception as e:
    print('ERROR: Could not connect to the Neo4j Database. See console for details.')
    raise SystemExit(e)
df = pd.DataFrame(graph.run("MATCH (n:Road) RETURN n.id, n.latitude, n.longitude").to_table(),columns=['ID','Latitude','Longitude'])
df.head()
gdf = geopandas.GeoDataFrame(df, geometry=geopandas.points_from_xy(df.Longitude, df.Latitude))
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
ax = world[world.continent == 'North America'].plot(color='white', edgecolor='black')
gdf.plot(ax=ax, color='red')
plt.show()
I don't know how to plot the relationships between the nodes. Any suggestions? Thanks.
In your example, which plots with GeoPandas.GeoDataFrame.plot(), the geometry of the GeoDataFrame is point data (lat/lon coordinates from the Road nodes stored in Neo4j). To plot the relationships using this method you'd also need line geometry connecting the road nodes (intersections).
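To make that concrete for the road data, here is a minimal sketch. It assumes the node and relationship rows have already been fetched into plain Python structures; the sample values are copied from the question, and `edges_to_wkt` is a hypothetical helper name, not part of any library. It builds one WKT LINESTRING per pair of connected nodes, which is the line geometry the plot needs.

```python
# Sample rows copied from the question: id -> (latitude, longitude).
# The coordinates are stored as strings there, so they are kept as strings here.
nodes = {
    "0": ("41.974556", "-121.904167"),
    "1": ("41.974766", "-121.902153"),
    "2": ("41.988075", "-121.896790"),
    "3": ("41.998032", "-121.889603"),
}
# (n1.id, n2.id) pairs as returned by the undirected MATCH, so 0-1 appears twice.
edges = [("0", "1"), ("0", "3"), ("1", "0"), ("1", "2")]


def edges_to_wkt(nodes, edges):
    """Return one LINESTRING WKT string per undirected edge (hypothetical helper)."""
    seen = set()
    lines = []
    for a, b in edges:
        key = frozenset((a, b))
        if key in seen:
            continue  # skip the reverse copy of an edge already handled
        seen.add(key)
        (lat_a, lon_a), (lat_b, lon_b) = nodes[a], nodes[b]
        # WKT uses x y order, i.e. longitude first, then latitude.
        lines.append(f"LINESTRING ({lon_a} {lat_a}, {lon_b} {lat_b})")
    return lines


road_lines = edges_to_wkt(nodes, edges)
```

The resulting strings can be parsed with geopandas.GeoSeries.from_wkt and plotted on the same axes as the points.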
I just went through a similar exercise with airport and flight data; perhaps you can adapt it to your data.
First, in the Cypher query that fetches the data to be loaded into a GeoDataFrame, I find every flight using the pattern (:Airport)-[:FLIGHT_TO]->(:Airport), as I want each row in my GeoDataFrame to be a flight route between two airports. I also calculate weighted degree centrality for each airport so that node size can be styled relative to centrality in the plot. The query also generates WKT for the airport locations (POINT) and for the flight itself (LINESTRING).
AIRPORT_QUERY = """
MATCH (origin:Airport)-[f:FLIGHT_TO]->(dest:Airport)
CALL {
    WITH origin
    MATCH (origin)-[f:FLIGHT_TO]-()
    RETURN sum(f.num_flights) AS origin_centrality
}
CALL {
    WITH dest
    MATCH (dest)-[f:FLIGHT_TO]-()
    RETURN sum(f.num_flights) AS dest_centrality
}
RETURN {
    origin_wkt: "POINT (" + origin.location.longitude + " " + origin.location.latitude + ")",
    origin_iata: origin.iata,
    origin_city: origin.city,
    origin_centrality: origin_centrality,
    dest_centrality: dest_centrality,
    dest_wkt: "POINT (" + dest.location.longitude + " " + dest.location.latitude + ")",
    dest_iata: dest.iata,
    dest_city: dest.city,
    length: f.length,
    num_flights: f.num_flights,
    geometry: "LINESTRING (" + origin.location.longitude + " " + origin.location.latitude + "," + dest.location.longitude + " " + dest.location.latitude + ")"
}
AS airport
"""
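For the road data in the question, a much smaller query along the same lines might be enough. This is only a sketch: it assumes the nodes carry the Road label used in the question's code, that the relationship type is GO, and that latitude and longitude are stored as strings as in the sample output. `ROAD_QUERY` is a name made up for this example.

```python
# Hypothetical adaptation of the airport query to the Road/GO data in the question.
ROAD_QUERY = """
MATCH (a:Road)-[:GO]-(b:Road)
// The undirected match returns every road in both directions; keep one copy
WHERE a.id < b.id
RETURN a.id AS from_id,
       b.id AS to_id,
       "LINESTRING (" + a.longitude + " " + a.latitude + ", "
                      + b.longitude + " " + b.latitude + ")" AS geometry
"""
```

Each returned row is then one road segment, with its geometry already in WKT form.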
The Neo4j Python Driver has a to_df() method which we can use to convert the result set from our Cypher query into a pandas DataFrame. Then, when we create the GeoPandas GeoDataFrame, we can parse the WKT returned by the Cypher statement into Shapely geometries.
def get_airport(tx):
    results = tx.run(AIRPORT_QUERY)
    # expand=True flattens the returned map into one column per key
    df = results.to_df(expand=True)
    # Rename the expanded columns; this list must match the order to_df() returned them in
    df.columns=['origin_city','origin_wkt', 'dest_city', 'dest_wkt', 'origin_centrality', 'length', 'origin_iata', 'geometry','num_flights', 'dest_centrality', 'dest_iata']
    # Parse the WKT strings into Shapely geometries
    df['geometry'] = geopandas.GeoSeries.from_wkt(df['geometry'])
    df['origin_wkt'] = geopandas.GeoSeries.from_wkt(df['origin_wkt'])
    df['dest_wkt'] = geopandas.GeoSeries.from_wkt(df['dest_wkt'])
    # Use the flight LINESTRING as the active geometry
    gdf = geopandas.GeoDataFrame(df, geometry='geometry')
    return gdf
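get_airport takes a transaction, so it has to be run through a session. One way to do that is sketched below, assuming version 5.x of the official neo4j driver, where sessions provide execute_read. `run_read` is a hypothetical helper name, and the connection details in the usage comment are placeholders.

```python
def run_read(driver, work):
    """Run a transaction function such as get_airport in a managed read transaction."""
    with driver.session() as session:
        # execute_read calls work(tx) and returns whatever work returns
        return session.execute_read(work)


# Usage, assuming the neo4j package is installed and a server is reachable:
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("bolt://localhost:7687", auth=(user, pswd))
#   flights_gdf = run_read(driver, get_airport)
```

Building the DataFrame inside the transaction function matters, because the result can no longer be consumed once the transaction has finished.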
And now we're ready to plot the flights. We can dynamically set the marker size for the airports using the weighted degree centrality column for the airport.
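# flights_gdf is the GeoDataFrame returned by get_airport
# Note: geopandas.datasets was removed in GeoPandas 1.0; on newer versions, read a Natural Earth file from disk instead
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
base = world[world.name == 'United States of America'].plot(color='white', edgecolor='black')
# Plot the origin airports, sized by weighted degree centrality
flights_gdf = flights_gdf.set_geometry("origin_wkt")
flights_gdf.plot(ax=base, markersize='origin_centrality')
# Plot the flight routes as thin lines on the same axes
flights_gdf = flights_gdf.set_geometry("geometry")
flights_gdf.plot(ax=base, markersize=0.1, linewidth=0.01)
plt.show()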
Hope that helps.