I am new to Python and need some help.
I have data in two DataFrames. df1 has four columns: latitude, longitude, datetime and temperature.
df2 has only latitude, longitude and datetime, and I need to interpolate its temperature using the data from df1.
The interpolation has to use both the coordinates and the datetime, and I don't know how to do that.
Example DataFrames:
df1:
lat | lon | Datetime | temp
---------------------------------------------------
15.13 | 38.52 | 2019-03-09 16:05:07 | 23
12.14 | 37.536 | 2019-03-15 09:50:07 | 22
13.215 | 39.86 | 2019-03-09 11:03:47 | 21
11.1214 | 38.536 | 2019-03-10 16:41:18 | 22
12.14 | 37.536 | 2019-03-09 06:15:27 | 19
df2:
lat | lon | Datetime
---------------------------------------------
13.13 | 38.82 | 2019-03-06 04:05:07
11.14 | 36.36152 | 2019-03-15 19:51:07
10.214 | 39.123 | 2019-03-19 11:01:08
12.14 | 37.536 | 2019-03-10 16:15:27
Which method or function should I use?
A straightforward way to handle the time dimension is to convert each timestamp to the number of seconds elapsed since a fixed reference point in the past. All three coordinates (lat, lon and time) can then be interpolated as plain floats.
Here are your input dataframes df1 and df2:
import pandas as pd

df1 = pd.DataFrame({'lat': [15.13, 12.14, 13.215, 11.1214, 12.14],
                    'lon': [38.52, 37.536, 39.86, 38.536, 37.536],
                    'Datetime': pd.to_datetime(['2019-03-09 16:05:07', '2019-03-15 09:50:07', '2019-03-09 11:03:47', '2019-03-10 16:41:18', '2019-03-09 06:15:27']),
                    'temp': [23, 22, 21, 22, 19]})
df2 = pd.DataFrame({'lat': [13.13, 11.14, 10.214, 12.14],
                    'lon': [38.82, 36.36152, 39.123, 37.536],
                    'Datetime': pd.to_datetime(['2019-03-06 04:05:07', '2019-03-15 19:51:07', '2019-03-19 11:01:08', '2019-03-10 16:15:27'])})
Here is how you could convert time to floats, based on seconds from a reference point in the past:
ref = pd.Timestamp('2019-03-01 00:00:00')  # reference point earlier than all the data
df1['seconds'] = (df1['Datetime'] - ref).dt.total_seconds()
df2['seconds'] = (df2['Datetime'] - ref).dt.total_seconds()
Finally, you can use an interpolation function from scipy (or any other package) on the lat, lon and seconds columns. Note that griddata defaults to linear interpolation, which only works inside the convex hull of the known points: some of your points in df2 fall outside the region covered by df1, so you get NaN for them:
from scipy.interpolate import griddata
griddata(df1[['lat', 'lon', 'seconds']].values,   # known points
         df1['temp'].values,                      # known temperatures
         df2[['lat', 'lon', 'seconds']].values)   # points to interpolate at
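If the NaN values for points outside df1's range are a problem, one possible workaround (my assumption, not something the question asked for) is to fall back to nearest-neighbour values for those points. The sketch below repeats the steps above end to end and adds that fallback. The `extrapolated` column name is made up for illustration, and `rescale=True` is an optional griddata argument I chose so that degrees and seconds are put on a comparable scale before distances are computed.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import griddata

df1 = pd.DataFrame({
    'lat': [15.13, 12.14, 13.215, 11.1214, 12.14],
    'lon': [38.52, 37.536, 39.86, 38.536, 37.536],
    'Datetime': pd.to_datetime(['2019-03-09 16:05:07', '2019-03-15 09:50:07',
                                '2019-03-09 11:03:47', '2019-03-10 16:41:18',
                                '2019-03-09 06:15:27']),
    'temp': [23, 22, 21, 22, 19]})
df2 = pd.DataFrame({
    'lat': [13.13, 11.14, 10.214, 12.14],
    'lon': [38.82, 36.36152, 39.123, 37.536],
    'Datetime': pd.to_datetime(['2019-03-06 04:05:07', '2019-03-15 19:51:07',
                                '2019-03-19 11:01:08', '2019-03-10 16:15:27'])})

# Convert timestamps to seconds since a fixed reference point.
ref = pd.Timestamp('2019-03-01')
for df in (df1, df2):
    df['seconds'] = (df['Datetime'] - ref).dt.total_seconds()

cols = ['lat', 'lon', 'seconds']
points = df1[cols].to_numpy()                 # known (lat, lon, seconds)
values = df1['temp'].to_numpy(dtype=float)    # known temperatures
targets = df2[cols].to_numpy()                # where temperatures are wanted

# Linear interpolation: NaN for targets outside the convex hull of df1.
linear = griddata(points, values, targets, method='linear', rescale=True)
# Nearest-neighbour lookup: always returns one of the known temperatures.
nearest = griddata(points, values, targets, method='nearest', rescale=True)

# Flag the rows where linear interpolation failed, then fill them.
df2['extrapolated'] = np.isnan(linear)
df2['temp'] = np.where(df2['extrapolated'], nearest, linear)
print(df2[['lat', 'lon', 'Datetime', 'temp', 'extrapolated']])
```

Rows flagged in `extrapolated` are not really interpolated, they just copy the closest known measurement, so treat them with caution. With only five known points, most of df2 ends up in that category.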