lat
50.63757782
50.6375742
50.6375742
50.6374077762
50.63757782
50.6374077762
50.63757782
50.63757782
I have plotted a graph with these latitude values and noticed a sudden spike (an outlier) in the graph. I want to replace every lat value with the median of the last three values so that I can see a meaningful result.
The output might be
lat lat_med
50.63757782 50.63757782
50.6375742 50.6375742
50.6375742 50.6375742
50.63740778 50.6375742
50.63757782 50.6375742
50.63740778 50.6375742
50.63757782 50.6375742
50.63757782 50.6375742
I have thousands of such lat values and need to solve this using a for loop. I know that the following code has errors, and since I am a beginner in Python, I would appreciate your help in solving this.
for i in range(0,len(df['lat'])):
    df['lat_med'][i]=numpy.median(numpy.array(df['lat'][i],df['lat'][i-2]))
I just realized that a median over three points does not serve my purpose and I need to consider five values. Is there a way to change the median function so that it handles as many values as I want? Thank you for your help.
def median(a, b, c):
    if a > b and a > c:
        return b if b > c else c
    if a < b and a < c:
        return b if b < c else c
    return a
Just go through the elements from the second to the second-to-last and save the median of the current, previous, and next element. Note that the first and last elements are left as they were.
Try this:
lat = [50.63757782, 50.6375742, 50.6375742, 50.6374077762, 50.63757782, 50.6374077762, 50.63757782, 50.63757782]

# returns the median value out of the three values
def median(a, b, c):
    if a > b and a > c:
        return b if b > c else c
    if a < b and a < c:
        return b if b < c else c
    return a

# add the first element
filtered = [lat[0]]
for i in range(1, len(lat) - 1):
    filtered += [median(lat[i - 1], lat[i], lat[i + 1])]
# add the last element
filtered += [lat[-1]]
print(filtered)
What you are doing is a very basic median filter.
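Since the question also asks about five values (or any number), here is a minimal sketch of the same idea with an adjustable window. The function name `median_filter` and the `window` parameter are my own choices for illustration, not part of any library. It assumes the values are in a plain list, that the window is centered on each element, and that the window size is odd. Elements too close to either end to fill a whole window are left unchanged, as in the three-value version.

```python
from statistics import median

def median_filter(values, window=5):
    """Centered median filter (hypothetical helper).

    `window` must be a positive odd integer. Elements that do not have
    window // 2 neighbours on both sides are copied through unchanged.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    filtered = list(values)  # copy, so the edge elements stay as they were
    for i in range(half, len(values) - half):
        # median of the current element and `half` neighbours on each side
        filtered[i] = median(values[i - half:i + half + 1])
    return filtered

lat = [50.63757782, 50.6375742, 50.6375742, 50.6374077762,
       50.63757782, 50.6374077762, 50.63757782, 50.63757782]

print(median_filter(lat, window=3))
print(median_filter(lat, window=5))
```

With `window=3` this should give the same result as the loop above. A larger window smooths more aggressively but also leaves more elements untouched at each end.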