I am working with the Python empiricaldist package to plot the CDF
of the speed distribution for each travel mode (multi-class).
data.head()
+---+---------+----------+----------+-------+--------------+------------+
|   | trip_id | distance | duration | speed | acceleration | travelmode |
+---+---------+----------+----------+-------+--------------+------------+
| 0 |  303637 |     5.92 |     0.51 |  3.20 |      0.00173 | metro      |
| 1 |  303638 |     3.54 |     0.22 |  4.44 |      0.00557 | bus        |
| 2 |  303642 |     4.96 |     0.20 |  6.84 |      0.00944 | car        |
| 3 |  303662 |     6.53 |     0.97 |  1.86 |      0.00053 | foot       |
| 4 |  303663 |    40.23 |     0.94 | 11.85 |      0.00349 | car        |
+---+---------+----------+----------+-------+--------------+------------+
Now I want to plot the CDF of the speed column for each mode in travelmode. So,
import matplotlib.pyplot as plt
from empiricaldist import Cdf

def decorate_cdf(title, x, y):
    """Labels the axes and sets the title.

    title: string
    x, y: axis label strings
    """
    plt.xlabel(x)
    plt.ylabel(y)
    plt.title(title)

# Draw one CDF curve per travel mode on the same axes
for name, group in data.groupby('travelmode'):
    Cdf.from_seq(group.speed).plot()

title, x, y = 'Speed by mode', 'speed (km/h)', 'CDF'
decorate_cdf(title, x, y)
How do I then add a legend to the plot so I can tell which curve belongs to which mode?
Use matplotlib's pyplot.legend function, passing the group names as labels:
plt.legend(data.groupby('travelmode').groups.keys())
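An alternative that avoids keeping the label order in sync with the plotting order is to label each curve as it is drawn and then call legend() with no arguments. The following is only a minimal sketch: the DataFrame is a toy stand-in for the question's data (the extra rows are made up), and a hand-rolled empirical CDF (sorted values against cumulative fractions) stands in for Cdf.from_seq so the example needs only pandas, NumPy, and matplotlib. With empiricaldist itself, passing label=name to plot() should have the same effect, since as far as I know its plot method forwards keyword options to matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Toy stand-in for the question's DataFrame (values are illustrative only).
data = pd.DataFrame({
    "speed": [3.20, 4.44, 6.84, 1.86, 11.85, 2.90, 5.10, 2.10],
    "travelmode": ["metro", "bus", "car", "foot", "car", "metro", "bus", "foot"],
})

fig, ax = plt.subplots()
for name, group in data.groupby("travelmode"):
    # Hand-rolled empirical CDF: sorted speeds vs. cumulative fraction of trips.
    xs = np.sort(group["speed"].to_numpy())
    ps = np.arange(1, len(xs) + 1) / len(xs)
    # The label is attached to the curve itself, so the legend cannot drift
    # out of sync with the order in which the groups were plotted.
    ax.plot(xs, ps, label=name)

ax.set_xlabel("speed (km/h)")
ax.set_ylabel("CDF")
ax.set_title("Speed by mode")
ax.legend(title="travelmode")  # picks up the labels set in the loop
```

Because each label travels with its line, this keeps working if the groups are filtered or reordered before plotting.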