I have a dataset that you can reproduce by loading the following dict into a dataframe:
{'entity_ts': {0: '2022-11-01T00:00:56.000Z', 1: '2022-11-01T00:00:56.000Z'}, 'entity_id': {0: 'WAZE.jams.1172133072', 1: 'WAZE.jams.1284082818'}, 'street': {0: 'Ac. Monsanto, Benfica', 1: nan}, 'position': {0: {'type': 'GeometryCollection', 'geometries': [{'coordinates': [[[-9.168816, 38.741779], [-9.169618, 38.741353], [-9.16976, 38.741289]]], 'type': 'MultiLineString'}]}, 1: {'type': 'GeometryCollection', 'geometries': [{'coordinates': [[[-9.16116, 38.774899], [-9.16083, 38.774697]]], 'type': 'MultiLineString'}]}}, 'level': {0: 5, 1: 5}, 'length': {0: 99, 1: 36}, 'delay': {0: -1, 1: -1}, 'speed': {0: 0.0, 1: 0.0}}
My problem is:
I could not load this data properly using geopandas. In geopandas it is important to specify the geometry column, but the format of this 'position' column is quite new to me.
1 - I tried loading it into a pandas DataFrame:
import json
import pandas as pd
data = pd.read_csv('data.csv', converters={'position': json.loads})
2 - Then I converted it to a GeoDataFrame:
import geopandas as gpd
import contextily as ctx
crs = "EPSG:4326"  # the {'init': 'epsg:4326'} dict form is deprecated
gdf = gpd.GeoDataFrame(
data, geometry=data['position'], crs=crs)
But I got this error:
TypeError: Input must be valid geometry objects: {'type': 'GeometryCollection', 'geometries': [{'coordinates': [[[-9.168816, 38.741779], [-9.169618, 38.741353], [-9.16976, 38.741289]]], 'type': 'MultiLineString'}]}
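For context, the error says that each value in 'position' is a plain GeoJSON-style dict, not a geometry object. Here is a minimal, standard-library-only sketch of how that nesting is laid out, using the second row of the sample data. The helper name extract_lines is hypothetical and only there for illustration; it is not part of geopandas or shapely.

```python
# Value copied from row 1 of the 'position' column in the question.
position = {
    "type": "GeometryCollection",
    "geometries": [
        {
            "type": "MultiLineString",
            "coordinates": [
                [[-9.16116, 38.774899], [-9.16083, 38.774697]],
            ],
        }
    ],
}


def extract_lines(geometry_collection):
    """Hypothetical helper: return each line as a list of (lon, lat) tuples."""
    lines = []
    for geom in geometry_collection["geometries"]:
        if geom["type"] == "MultiLineString":
            # A MultiLineString holds a list of lines, each a list of points.
            for line in geom["coordinates"]:
                lines.append([tuple(pt) for pt in line])
        elif geom["type"] == "LineString":
            # A LineString holds a single list of points.
            lines.append([tuple(pt) for pt in geom["coordinates"]])
    return lines


lines = extract_lines(position)
print(lines)
```

So the coordinates sit three levels deep: collection, then geometries, then one list of points per line. Those dicts have to be turned into geometry objects before geopandas will accept the column.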
One option is to use the str accessor to grab the geometry info from each dict, then build the geometry with shape and take its geoms:
from shapely.geometry import shape
gdf = gpd.GeoDataFrame(data, geometry=data.pop("position").str["geometries"]
.explode().apply(lambda x: shape(x).geoms[0]), crs=crs)
Output:
print(gdf)
entity_ts entity_id street level length delay speed geometry
0 2022-11-01T00:00:56.000Z WAZE.jams.1172133072 Ac. Monsanto, Benfica 5 99 -1 0.0 LINESTRING (-9.16882 38.74178, -9.16962 38.74135, -9.16976 38.74129)
1 2022-11-01T00:00:56.000Z WAZE.jams.1284082818 NaN 5 36 -1 0.0 LINESTRING (-9.16116 38.77490, -9.16083 38.77470)
# <class 'geopandas.geodataframe.GeoDataFrame'>
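Note that geoms[0] keeps only the first line of each MultiLineString. That is fine for the sample rows, which each contain a single line, but I am assuming the rest of your data looks the same. If some rows might hold several lines, here is a rough sketch of how to check one value with shapely alone before deciding whether to keep the whole MultiLineString instead:

```python
from shapely.geometry import shape

# Value copied from row 1 of the 'position' column in the question.
position = {
    "type": "GeometryCollection",
    "geometries": [
        {
            "type": "MultiLineString",
            "coordinates": [
                [[-9.16116, 38.774899], [-9.16083, 38.774697]],
            ],
        }
    ],
}

# Convert every GeoJSON-style dict in the collection to a shapely geometry.
geoms = [shape(g) for g in position["geometries"]]

multi = geoms[0]
n_lines = len(multi.geoms)      # number of LineStrings inside the MultiLineString
first_line = multi.geoms[0]     # the part that .geoms[0] would keep

print(multi.geom_type, n_lines)
print(list(first_line.coords))
```

If n_lines is greater than 1 for any row, dropping the .geoms[0] part and keeping shape(x) as the geometry would preserve all the lines.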