I am working with this CSV file. I am trying to calculate the distance the car travelled during the 700 seconds it recorded. The distance should be the area under the speed curve, since (m/s) * (s) gives meters.
This is my code:
import csv
import pprint
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from numpy import trapz
df = pd.read_csv("AutoRitData.csv")
new = df.filter(['timestamp','speed'], axis=1)
# Integrate the speed column only; flattening both columns would mix timestamps into the values.
new_array = new['speed'].values
print(new_array)
area = trapz(new_array, dx=1)
print("area =", area)
df.plot(x='timestamp', y='speed')
plt.show()
# print(df.columns)
I am confused about why the result is different for different dx values. In my eyes, making more trapezoids (smaller dx) should make the result more accurate, not smaller. Or is dx not the width of the trapezoids?
Also, I would like to change the color of the line wherever the curve is above 13.9 m/s (which is 50 km/h).
I hope someone who is familiar with scientific/physics programming can help me out.
The outcome graph looks like this:
If you look at the documentation for numpy.trapz (https://docs.scipy.org/doc/numpy/reference/generated/numpy.trapz.html), you will notice that dx=1 is the default, and that dx can be any scalar.
For the best accuracy, take the step widths from the timestamps themselves:
import numpy as np
dx = np.diff(new['timestamp'])
If your time steps vary and the timestamps are in seconds, this should be enough.
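As a rough illustration of integrating over uneven time steps, here is a minimal sketch with made-up numbers (not the values from AutoRitData.csv). It passes the timestamps through the x= argument, which is the documented way to give every trapezoid its own width. The `trapezoid` alias is only there because newer NumPy releases renamed `np.trapz` to `np.trapezoid`.

```python
import numpy as np

# np.trapezoid is the NumPy 2.x name; older releases only have np.trapz.
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

# Made-up samples: timestamps in seconds with uneven gaps, speed in m/s.
timestamp = np.array([0.0, 1.0, 2.5, 3.0, 5.0])
speed = np.array([0.0, 2.0, 4.0, 4.0, 6.0])

# Passing x= lets each trapezoid use its own width.
distance_x = trapezoid(speed, x=timestamp)

# The same sum written out by hand:
# mean of neighbouring speeds multiplied by the gap between them.
widths = np.diff(timestamp)
distance_manual = np.sum(widths * (speed[1:] + speed[:-1]) / 2)

print(distance_x, distance_manual)
```

Both numbers should agree, since x= does exactly the hand-written sum.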
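In fact, dx must be expressed in the same time unit as the speed. Your speed is in m/s and your samples are 1 second apart, so dx = 1 gives meters. If you were integrating km/h over the same 1-second samples, dx would have to be 1/3600 (hours), and the result would be in km.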
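To make the unit bookkeeping concrete, here is a small sketch using made-up speeds sampled once per second (an assumption for illustration, not your data). Integrating km/h with dx in hours and integrating m/s with dx in seconds should describe the same distance.

```python
import numpy as np

# np.trapezoid is the NumPy 2.x name; older releases only have np.trapz.
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

# Made-up speeds in km/h, one sample per second.
speed_kmh = np.array([0.0, 18.0, 36.0, 36.0, 18.0])

# Speed in km/h needs the step in hours: 1 s = 1/3600 h.
dx_hours = 1.0 / 3600.0
distance_km = trapezoid(speed_kmh, dx=dx_hours)

# Equivalent route: convert to m/s first, then use the step in seconds.
speed_ms = speed_kmh / 3.6
distance_m = trapezoid(speed_ms, dx=1.0)

print(distance_km * 1000.0, distance_m)
```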
To answer your question, dx is the time step in
distance = INTEGRAL(velocity * dx)
It is the width of each trapezoid, but your data is sampled in 1-second time steps, so you cannot set dx arbitrarily. Changing dx does not add trapezoids; it only rescales the width of the existing ones. If you had 0.5-second data, you could have used dx=0.5.
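A short sketch of why a smaller dx shrinks the result, again on made-up numbers: the number of trapezoids is fixed by the number of samples, so halving dx simply halves the area. Only when the data is actually resampled at 0.5 s (here by linear interpolation, purely for illustration) does dx=0.5 give the same distance.

```python
import numpy as np

# np.trapezoid is the NumPy 2.x name; older releases only have np.trapz.
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

# Made-up speeds in m/s, sampled once per second at t = 0, 1, 2, 3, 4.
t = np.arange(5.0)
speed = np.array([0.0, 2.0, 4.0, 4.0, 6.0])

d_one = trapezoid(speed, dx=1.0)    # correct step for 1 s data
d_half = trapezoid(speed, dx=0.5)   # same 4 trapezoids, each half as wide

# Resample to a true 0.5 s grid; now dx=0.5 matches the data.
t_fine = np.linspace(0.0, 4.0, 9)
speed_fine = np.interp(t_fine, t, speed)
d_fine = trapezoid(speed_fine, dx=0.5)

print(d_one, d_half, d_fine)
```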
****EDIT****
import pandas as pd
import numpy as np
Df = pd.read_csv('AutoRitData.csv')
Distance1 = np.trapz(Df['speed'],dx=1)
Distance2 = np.trapz(Df['speed'],dx=0.5)
Distance3 = np.trapz(Df['speed'],dx=np.diff(Df['timestamp']))
>>> Distance1 = 10850.064
>>> Distance2 = 5425.03
>>> Distance3 = 10850.064
It's clear that Distance1 and Distance3 are the correct answers, since your data is not available at dx=0.5, i.e. at half-second resolution.
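The question also asks about drawing the curve in a different color above 13.9 m/s. One possible approach, sketched here on made-up data, is to plot the series twice using NumPy masked arrays so that each call only draws its own part. The helper name `split_at_limit` is hypothetical, and note that this simple version leaves a small gap where the curve crosses the limit.

```python
import numpy as np
import matplotlib.pyplot as plt

LIMIT = 13.9  # m/s, roughly 50 km/h


def split_at_limit(speed, limit=LIMIT):
    """Return (below, above): masked copies hiding the other part."""
    speed = np.asarray(speed, dtype=float)
    below = np.ma.masked_where(speed > limit, speed)
    above = np.ma.masked_where(speed <= limit, speed)
    return below, above


# Made-up data standing in for the 'timestamp' and 'speed' columns.
timestamp = np.arange(8)
speed = np.array([5.0, 10.0, 13.0, 15.0, 16.0, 14.5, 12.0, 8.0])
below, above = split_at_limit(speed)

fig, ax = plt.subplots()
# Masked points are skipped, so each line only covers its own range.
ax.plot(timestamp, below, color="tab:blue", label="at or below 13.9 m/s")
ax.plot(timestamp, above, color="tab:red", label="above 13.9 m/s")
ax.axhline(LIMIT, color="gray", linestyle="--", linewidth=0.8)
ax.set_xlabel("timestamp (s)")
ax.set_ylabel("speed (m/s)")
ax.legend()
# plt.show()  # uncomment when running interactively
```

With the real file, Df['timestamp'] and Df['speed'] would take the place of the made-up arrays.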