I'm pretty sure that this problem has a simple solution, but I've been stuck for a while and can't seem to figure it out. Here's what I've done so far:
# import libraries
import folium
import pandas as pd
import numpy as np
import json
# import data
cases = pd.read_csv('COVID-19_Cases__Tests__and_Deaths_by_ZIP_Code.csv')
I then rename the column I need so that it matches the GeoJSON file:
cases.rename(columns = {'ZIP Code':'ZIP'}, inplace = True)
Because the data is listed by week and I only need the most up-to-date numbers, I grouped by ZIP code and took the maximum of each column:
cases_sorted = cases.groupby('ZIP')
maximums = cases_sorted.max()
So far so good. I drop a few unnecessary rows:
maximums_cleaning = maximums.drop('60666',axis = 0)
maximums_cleaned = maximums_cleaning.drop('Unknown',axis = 0)
And my dataframe looks like this: [image: Dataframe]
I then load a map:
import folium
map = folium.Map(location=[41.8781, -87.6298], zoom_start=15)
map
Change the column to type String:
maximums_cleaned['ZIP']=maximums_cleaned['ZIP'].astype(str)
And then I get this error:
KeyError: 'ZIP'
I then load my GeoJSON file to layer over the map:
# load GeoJson
map.choropleth(geo_data="Boundaries - ZIP Codes.geojson",
data=maximums_cleaned, # my dataset
columns=['ZIP', 'Case Rate - Cumulative'], # 'ZIP' is matched against the GeoJSON zip code; 'Case Rate - Cumulative' sets the fill color of each zip code area
key_on='feature.properties.postalCode',
fill_color='BuPu', fill_opacity=0.7, line_opacity=0.2,
legend_name='Cases')
Again I get this error: KeyError: "None of ['ZIP'] are in the columns"
I have tried the code without converting to a string and received the same error when loading my GeoJSON file. I've also tried grouping by different columns, with no success. I think the problem is that the "ZIP" column is the first column and its header sits lower than the others. This likely needs to be addressed for the GeoJSON file to work with the dataframe, but I cannot figure out how to fix it. I appreciate your input, thanks!
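When you group by 'ZIP', it becomes the index of the resulting DataFrame, and an index is not a column. That is why maximums_cleaned['ZIP'] raises KeyError: 'ZIP', and why 'ZIP' appears on a lower header row than the other columns when the frame is displayed.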
One solution that could work, is copying your index to a column:
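Here is a minimal sketch of that idea. The small frame below is made-up stand-in data, not the real CSV, so the values are only illustrative. It only assumes the column names from the question.

```python
import pandas as pd

# Hypothetical stand-in for the weekly CSV from the question
cases = pd.DataFrame({
    "ZIP": ["60601", "60601", "60602", "Unknown"],
    "Case Rate - Cumulative": [10.0, 25.0, 7.5, 3.0],
})

# After groupby(...).max(), 'ZIP' is the index, not a column
maximums = cases.groupby("ZIP").max()
maximums_cleaned = maximums.drop("Unknown", axis=0).copy()

# Copy the index into a regular column so that columns=['ZIP', ...] can find it
maximums_cleaned["ZIP"] = maximums_cleaned.index.astype(str)

print(maximums_cleaned.columns.tolist())
```

With 'ZIP' available as a column, the choropleth call should be able to look it up. If you would rather not have 'ZIP' as both the index and a column, maximums_cleaned.reset_index() should give a similar result by moving the index into a column and replacing it with a default integer index.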