I have an excel sheet with values representing start and end_time of a time series data, as shown below. Times are in seconds.
+------------+---------+-------+
Start_Time End_Time Value
0 2 A
2 3 B
3 9 A
9 11 C
I want to extrapolate the values between start and end_time and display the values for each second.
+---------+------+
Time Value
0 A
1 A
2 A
3 B
4 A
5 A
6 A
7 A
8 A
9 A
10 C
11 c
Any help to implement it in Python will be appreciated. Thanks.
You should find how to read your excel sheet with pandas easily, and options will depend on the file itself, so I won't cover this part. Below is the reproduction of your sample dataframe, used for the example.
import pandas as pd
df = pd.DataFrame({'Start_Time': [0, 2, 3, 9],
'End_Time': [2, 3, 9, 11],
'Value': ['A', 'B', 'A', 'C']})
>>> df
Out[]:
End_Time Start_Time Value
0 2 0 A
1 3 2 B
2 9 3 A
3 11 9 C
(pd.Series(range(df.End_Time.max() + 1), name='Value') # Create a series on whole range
.map(df.set_index('End_Time').Value) # Set values from "df"
.bfill() # Backward fill NaNs values
.rename_axis('Time')) # Purely cosmetic axis rename
Out[]:
Time
0 A
1 A
2 A
3 B
4 A
5 A
6 A
7 A
8 A
9 A
10 C
11 C
Name: Value, dtype: object
Create the whole "Time" range
s = pd.Series(range(df.End_Time.max() + 1))
>>> s
Out[]:
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 11
dtype: int32
Use "End_Time" as index for df
>>> df.set_index('End_Time')
Out[]:
Start_Time Value
End_Time
2 0 A
3 2 B
9 3 A
11 9 C
Map df values to corresponding "End_Time" values from s
s = s.map(df.set_index('End_Time').Value)
>>> s
Out[]:
0 NaN
1 NaN
2 A
3 B
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 A
10 NaN
11 C
dtype: object
Backward-fill the NaN
values
s = s.bfill()
>>> s
Out[]:
0 A
1 A
2 A
3 B
4 A
5 A
6 A
7 A
8 A
9 A
10 C
11 C
dtype: object
Then rename_axis('Time')
only renames the series axis to match your desired output.
Note that this works here because you use excluding Start_Time
.
If you were using including Start_Time
(where Value
really starts at Start_Time
, which is more common) you should change End_Time
to Start_Time
and bfill()
to ffill()
(forward-fill).