Essentially, I have a table of timestamps and some data. I want to group the rows by identical timestamps and change the timestamps on a per-group basis. I got something working based on the question "Interpolate seconds to milliseconds in dataset?"
The solution seems to work fine for many rows but not for simple datasets and I can't figure out why. I've narrowed it down to a simple example below.
Data:
t val
0 0.3
0 0.2
0 0.6
0 0.4
Expected result:
t val
1 0.3
1 0.2
1 0.6
1 0.4
Code:
import pandas as pd
df = pd.DataFrame([[0, 0.3], [0, 0.2], [0, 0.6], [0, 0.4]], columns=["t", "val"])
# Group by timestamp and add +1 to each (just for demonstration)
df.t = df.groupby("t", group_keys=False).apply(lambda df: df.t + 1)
This raises ValueError: Columns must be same length as key
and I can't see what I'm doing wrong. Any help appreciated.
If you need the output values assigned back to a column, use GroupBy.transform and specify the column to process after groupby:
df.t = df.groupby('t')['t'].transform(lambda x: x + 1)
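To illustrate why transform fits here, the following is a minimal sketch using the question's sample data. As far as I can tell, the error in the question comes from apply returning a differently shaped result when there is only one group (this may depend on the pandas version), whereas transform returns a Series with the same index as the original frame. The variable name `shifted` is just for illustration.

```python
import pandas as pd

# Sample data from the question: a single group of four rows.
df = pd.DataFrame([[0, 0.3], [0, 0.2], [0, 0.6], [0, 0.4]], columns=["t", "val"])

# transform returns one value per input row, indexed like df,
# so the result can be assigned straight back to the column.
shifted = df.groupby("t")["t"].transform(lambda x: x + 1)

# The index matches the original frame, regardless of the number of groups.
assert shifted.index.equals(df.index)

df["t"] = shifted
print(df)
```

Because the result is aligned to the original index, the assignment works the same way for one group or for many.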
The linked solution with np.linspace should be changed to:
df.t = df.groupby('t')['t'].transform(lambda x: x + np.linspace(0, 1, len(x)))
print(df)
          t  val
0  0.000000  0.3
1  0.333333  0.2
2  0.666667  0.6
3  1.000000  0.4
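Note that np.linspace(0, 1, n) includes the endpoint 1.0, so the last row of one second would get the same value as the first row of the next second. If that matters for your data, one possible variant (an assumption about what you want, not part of the linked solution) is to pass endpoint=False. The two-group sample data below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: four rows at t=0 and two rows at t=1.
df = pd.DataFrame(
    {"t": [0, 0, 0, 0, 1, 1], "val": [0.3, 0.2, 0.6, 0.4, 0.5, 0.1]}
)

# endpoint=False spreads each group over [t, t + 1) instead of [t, t + 1],
# so offsets within a group never reach the next whole second.
df["t"] = df.groupby("t")["t"].transform(
    lambda x: x + np.linspace(0, 1, len(x), endpoint=False)
)
print(df)
```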
Or add a counter with GroupBy.cumcount:
df.t += df.groupby('t').cumcount()
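As a small sketch of what cumcount produces, here it is on made-up data with two groups. The counter restarts at 0 for every group, so adding it to t numbers the rows within each timestamp. The names `counter` and `whole_steps` are only for illustration.

```python
import pandas as pd

# Hypothetical data: three rows at t=0 and two rows at t=5.
df = pd.DataFrame({"t": [0, 0, 0, 5, 5], "val": [0.3, 0.2, 0.6, 0.4, 0.1]})

# cumcount numbers the rows of each group starting from 0.
counter = df.groupby("t").cumcount()

# Adding the counter shifts each row by its position within its group.
whole_steps = df["t"] + counter
print(whole_steps.tolist())
```

No lambda is involved here, which can make this noticeably faster than transform with a Python function on large frames.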