My test.csv contains many missing values, which pandas reads as NaN:
"Time","Y1","Y2","Y3"
"s","celsius","celsius","celsius"
"0.193","","",""
"0.697","","1",""
"1.074","","","-27"
"1.579","10","",""
"2.083","","5",""
"3.123","15","","-28"
"5.003","","",""
When I try to fill the gaps between valid points with interpolate, the filled values are whole integers instead of fractional ones:
import pandas as pd
df = pd.read_csv("test.csv")
df.loc[1:, "Y3"] = pd.to_numeric(df.loc[1:, "Y3"])
df.loc[1:, "Y3"] = df.loc[1:, "Y3"].interpolate(method='linear').ffill() #method='time' , method='index'
>>> print (df)
Time Y1 Y2 Y3
0 s celsius celsius celsius
1 0.193 NaN NaN NaN
2 0.697 NaN 1 NaN
3 1.074 NaN NaN -27
4 1.579 10 NaN -27 <<-----
5 2.083 NaN 5 -27 <<-----
6 3.123 15 NaN -28
7 5.003 NaN NaN -28
I can fix the NaNs at the start of the columns with bfill, but how can I fill the points between -27 and -28 with fractional values like -27.3, -27.6?
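The issue is that the units line (s, celsius, ...) is read as the first data row, so every column contains strings and gets a non-numeric (object) dtype.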
df.loc[1:, "Y3"] = pd.to_numeric(df.loc[1:, "Y3"])
doesn't change the dtype to a numeric one: assigning converted values back into a slice with .loc leaves the column's dtype as it was.
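A quick way to confirm the dtype problem is to inspect the column right after reading. This is a minimal sketch, assuming the question's CSV is inlined through io.StringIO (csv_text is a stand-in for test.csv):

```python
import io
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Stand-in for test.csv from the question (assumption: same content, inlined)
csv_text = '''"Time","Y1","Y2","Y3"
"s","celsius","celsius","celsius"
"0.193","","",""
"0.697","","1",""
"1.074","","","-27"
"1.579","10","",""
"2.083","","5",""
"3.123","15","","-28"
"5.003","","",""
'''

df = pd.read_csv(io.StringIO(csv_text))

# The units line ends up as the first data row
print(df["Y3"].iloc[0])

# Because of that string, the column is not numeric
print(is_numeric_dtype(df["Y3"]))
```

As long as that units string sits in the column, pandas cannot store the column as floats.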
You should not keep the units header as a data row. Read both header lines as a MultiIndex for the columns instead:
df = pd.read_csv("test.csv", header=[0, 1])
Then:
df['Y3'] = df['Y3'].interpolate(method='linear').ffill()
Output:
Time Y1 Y2 Y3
s celsius celsius celsius
0 0.193 NaN NaN NaN
1 0.697 NaN 1.0 NaN
2 1.074 NaN NaN -27.000000
3 1.579 10.0 NaN -27.333333
4 2.083 NaN 5.0 -27.666667
5 3.123 15.0 NaN -28.000000
6 5.003 NaN NaN -28.000000
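The steps above can be run end to end without a file. This is a minimal sketch, assuming the question's CSV is inlined through io.StringIO (csv_text is a stand-in for test.csv). The trailing bfill for the leading NaNs and the time-weighted variant with method='index' are optional additions, not part of the answer itself. Note that method='linear' treats the rows as equally spaced and ignores the Time values.

```python
import io
import pandas as pd

# Stand-in for test.csv from the question (assumption: same content, inlined)
csv_text = '''"Time","Y1","Y2","Y3"
"s","celsius","celsius","celsius"
"0.193","","",""
"0.697","","1",""
"1.074","","","-27"
"1.579","10","",""
"2.083","","5",""
"3.123","15","","-28"
"5.003","","",""
'''

# Read both header lines, so the units become a second column level
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

col = ("Y3", "celsius")
y3 = df[col].copy()  # keep the unfilled values for the variant below

# Row-based linear interpolation, then fill the trailing and leading edges
df[col] = y3.interpolate(method="linear").ffill().bfill()
print(df[col].tolist())

# Optional variant: weight the interpolation by the Time column instead of
# by row position, using Time as the index
by_time = pd.Series(
    y3.to_numpy(), index=df[("Time", "s")].to_numpy()
).interpolate(method="index")
print(by_time)
```

Since the samples are unevenly spaced in time, the two variants give different fractional values between -27 and -28. Which one is appropriate depends on whether the row order or the timestamps should drive the interpolation.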