
seaborn line plots with date on the x axis


Hi,

I am trying to recreate some of the covid-19 charts that we have seen. I am using data from the Johns Hopkins database.

The data is arranged so that the city names are in the rows and the columns are dates. A screenshot of the CSV file is attached. I want to plot line graphs in seaborn that have days on the x axis and confirmed cases by city on the y axis. For some reason, I am unable to reproduce the exponential curves of the death rate.

My code is:

'''loading the file'''
date_columns = list(range(12,123))
df_covid_us = pd.read_csv(covid_us_file, parse_dates=date_columns)
df_covid_us = pd.read_csv(covid_us_file)

'''slicing the columns needed. Province_State and the date columns'''
df = df_covid_us.iloc[:, np.r_[6, 12:123]]
df = df[df['Province_State']=='New York']

'''using df.melt'''
df2 = df.melt(id_vars='Province_State', var_name='Date', value_name='Deaths')

'''plotting using seaborn'''
sns.lineplot(x='Date',y='Deaths',data=df2, ci=None)
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=20))
plt.show()



Solution

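  • After melt, the Date column still holds the column headers as strings (parse_dates in read_csv parses cell values, not headers), so convert it with pd.to_datetime before plotting. With a small sample of made-up data: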

    import pandas as pd, seaborn as sns
    import matplotlib.pyplot as plt, matplotlib.dates as mdates
    
    df = pd.DataFrame({'Province_State':['American Samoa','Guam','Puerto Rico'],
                       '2020-01-22':[0,1,2],
                       '2020-01-23':[2,1,0]}) 
    
    # collect the date columns, then melt so that each date becomes a row
    date_columns = [c for c in df.columns.tolist() if c.startswith('2020-')]
    df2 = df.melt(id_vars='Province_State',value_vars=date_columns,
                  var_name='Date',value_name='Deaths')
    
    # dates from string to datetime
    df2['Date'] = pd.to_datetime(df2['Date'])
    
    sns.lineplot(x='Date',y='Deaths',hue='Province_State',data=df2)
    plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
    plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=1)) 
    
    plt.show()
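
If the real file has one row per county rather than one row per state, filtering on Province_State leaves several rows per date, and sns.lineplot then draws their mean instead of the state total. Here is a minimal sketch of one way to handle that, using pandas only. The county column name (Admin2), the M/D/YY header format and all the numbers are assumptions made for illustration, so check them against your own CSV.

```python
import pandas as pd

# Made-up wide table: one row per county, one column per day.
# The "Admin2" column and the M/D/YY headers are assumptions about the file layout.
wide = pd.DataFrame({
    "Admin2": ["Albany", "Kings", "Bergen"],
    "Province_State": ["New York", "New York", "New Jersey"],
    "1/22/20": [0, 1, 0],
    "1/23/20": [1, 3, 2],
    "1/24/20": [2, 7, 5],
})

id_columns = ["Admin2", "Province_State"]

# Wide to long: every column that is not an identifier is treated as a date.
long = wide.melt(id_vars=id_columns, var_name="Date", value_name="Deaths")

# Parse the header strings so the x axis becomes a real time axis.
long["Date"] = pd.to_datetime(long["Date"], format="%m/%d/%y")

# Sum the counties so that each state has exactly one value per day.
state_totals = long.groupby(["Province_State", "Date"], as_index=False)["Deaths"].sum()

new_york = state_totals[state_totals["Province_State"] == "New York"]
print(new_york)
```

The resulting new_york frame can be passed to sns.lineplot(x='Date', y='Deaths', data=new_york) in the same way as df2 above. With a few months of daily data, a wider tick spacing than interval=1 will probably be easier to read.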