I'm trying to turn a GeoJSON file from a URL into a pandas DataFrame. I've already read the file, but when I convert it to a DataFrame, the result is not what I expect.
!wget -q -O 'wuppertal.json' https://offenedaten-wuppertal.de/sites/default/files/Stadtbezirke_EPSG4326_JSON.json
print('Data downloaded!')
import urllib.request, json
with urllib.request.urlopen("https://offenedaten-wuppertal.de/sites/default/files/Stadtbezirke_EPSG4326_JSON.json") as url:
    wuppertal_data = json.loads(url.read().decode())
print(wuppertal_data)
out: "{'type': 'FeatureCollection', 'name': 'Stadtbezirke_EPSG4326_JSON', 'features': [{'type': 'Feature', 'properties': {'NAME': 'Langerfeld-Beyenburg', 'BEZIRK': '8', 'FLAECHE': 29391400}, 'geometry': {'type': 'Polygon', 'coordinates': [[[7.2510191991, 51.2917076298], [7.2505557773, 51.292270028], [7.2500517827, 51.2927158789], [7.2494997409, 51.2930461331], [7.2490203901, 51.2932752364], [7.2486303801, 51.2934148015], [7.2485802227, 51.2934327502], [7.2485234407, 51.2934518902], [7.2480306248, 51.2936180109], [7.2474431132, 51.293759494], [7.2471658102, 51.293788208], [7.2470561109, 51.2937995666], [7.24715411, 51.2937666386],...."
neighborhoods_data = wuppertal_data['features']
out: {'geometry': {'coordinates': [[[7.2510191991, 51.2917076298],
[7.2505557773, 51.292270028],
[7.2500517827, 51.2927158789],
[7.2494997409, 51.2930461331],
[7.2490203901, 51.2932752364],
[7.2486303801, 51.2934148015],
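import pandas as pd
# Collect one dict per district, then build the DataFrame once
# (DataFrame.append was removed in pandas 2.0).
rows = []
for data in neighborhoods_data:
    neighborhood_name = data['properties']['NAME']
    coordinates = data['geometry']['coordinates']
    rows.append({'Neighborhood': neighborhood_name,
                 'Coordinates': coordinates})
neighborhoods = pd.DataFrame(rows)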
out:
            Neighborhood                                        Coordinates
0   Langerfeld-Beyenburg  [[[7.2510191991, 51.2917076298], [7.2505557773...
1  Uellendahl-Katernberg  [[[7.1677144694, 51.3126516481], [7.1674618797...
2             Cronenberg  [[[7.1173964686, 51.2337079198], [7.117197067,...
The problem is that each row of my table holds one neighborhood with all of its coordinates aggregated into a single cell.
I would like one row per coordinate pair instead: neighborhood / Latitude / Longitude
e.g.: barmen/32,34/21,34
barmen/..
...
So the neighborhood name would be duplicated on every row.
Thanks for any help!
There might be a more efficient way, but this does the trick:
import urllib.request, json
import pandas as pd

with urllib.request.urlopen("https://offenedaten-wuppertal.de/sites/default/files/Stadtbezirke_EPSG4326_JSON.json") as url:
    wuppertal_data = json.loads(url.read().decode())

neighborhoods_data = wuppertal_data['features']

frames = []
for data in neighborhoods_data:
    neighborhood_name = data['properties']['NAME']
    # A GeoJSON Polygon is a list of rings; ring 0 is the outer boundary.
    # Each point is stored as [longitude, latitude], in that order.
    temp_df = pd.DataFrame(data['geometry']['coordinates'][0],
                           columns=['Longitude', 'Latitude'])
    temp_df['Neighborhood'] = neighborhood_name
    frames.append(temp_df)

# DataFrame.append was removed in pandas 2.0, so concatenate once at the end.
results = pd.concat(frames, ignore_index=True)
results = results[['Neighborhood', 'Latitude', 'Longitude']]
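
As a possible variation, here is a minimal sketch that builds the rows with a plain comprehension instead of intermediate DataFrames, and that also copes with MultiPolygon geometries if the file happens to contain any. The `sample` dictionary and its coordinates are made up so the example runs offline, and the helper names `outer_rings` and `features_to_points` are hypothetical; only the `NAME` property and the `[longitude, latitude]` point order are taken from the data shown in the question.

```python
import pandas as pd

# Hypothetical miniature FeatureCollection with made-up coordinates,
# shaped like the Wuppertal file so the sketch runs without a download.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"NAME": "A"},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[7.0, 51.0], [7.1, 51.0],
                                       [7.1, 51.1], [7.0, 51.0]]]}},
        {"type": "Feature",
         "properties": {"NAME": "B"},
         "geometry": {"type": "MultiPolygon",
                      "coordinates": [[[[7.2, 51.2], [7.3, 51.2],
                                        [7.3, 51.3], [7.2, 51.2]]],
                                      [[[7.4, 51.4], [7.5, 51.4],
                                        [7.5, 51.5], [7.4, 51.4]]]]}},
    ],
}


def outer_rings(geometry):
    """Yield the outer ring of each polygon in a Polygon or MultiPolygon."""
    if geometry["type"] == "Polygon":
        yield geometry["coordinates"][0]
    elif geometry["type"] == "MultiPolygon":
        for polygon in geometry["coordinates"]:
            yield polygon[0]


def features_to_points(feature_collection):
    """Return one row per boundary point: Neighborhood, Latitude, Longitude."""
    rows = [
        {"Neighborhood": feature["properties"]["NAME"],
         "Latitude": lat,
         "Longitude": lon}
        for feature in feature_collection["features"]
        for ring in outer_rings(feature["geometry"])
        # GeoJSON points are [longitude, latitude, (optional altitude)].
        for lon, lat, *_ in ring
    ]
    return pd.DataFrame(rows, columns=["Neighborhood", "Latitude", "Longitude"])


points = features_to_points(sample)
print(points)
```

With the real file, calling `features_to_points(wuppertal_data)` should give the same kind of table. Inner rings (holes) are skipped on purpose here, which is an assumption about what is wanted.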