I'm not sure how to correctly word what I want to do, or if this is even possible, so sorry if this sounds confusing and doesn't make sense. What I want to do basically is some way to create a separate column in a dataframe that sums consecutive values. For instance, I have this column, shown below, and I would like a separate column, which contains the sum of 0.00871 and 0.001, and then 0.000603 and 0.0123, and 0.000871 and 4.37E-05, and etc. Does this make sense? If so, is there a way I can code for this, instead of doing it manually? All help is appreciated! I would prefer to do this on Python, but any language would suffice.
d.index % 2
d["sourceValue"] + d["sourceValue"].shift()
import numpy as np
import pandas as pd
pd.DataFrame({"sourceValue": np.random.uniform(0, 1, 30)}).assign(
required=lambda d: np.where(
d.index % 2, d["sourceValue"] + d["sourceValue"].shift(), np.nan
)
)
sourceValue | required | |
---|---|---|
0 | 0.804529 | nan |
1 | 0.24858 | 1.05311 |
2 | 0.710556 | nan |
3 | 0.415231 | 1.12579 |
4 | 0.879645 | nan |
5 | 0.599331 | 1.47898 |
6 | 0.635365 | nan |
7 | 0.587647 | 1.22301 |
8 | 0.902475 | nan |
9 | 0.0398323 | 0.942307 |
10 | 0.357436 | nan |
11 | 0.469155 | 0.826591 |
12 | 0.732759 | nan |
13 | 0.721996 | 1.45476 |
14 | 0.030494 | nan |
15 | 0.0047623 | 0.0352563 |
16 | 0.01081 | nan |
17 | 0.130903 | 0.141713 |
18 | 0.643588 | nan |
19 | 0.837585 | 1.48117 |
20 | 0.63328 | nan |
21 | 0.506582 | 1.13986 |
22 | 0.855085 | nan |
23 | 0.100174 | 0.95526 |
24 | 0.146992 | nan |
25 | 0.64605 | 0.793041 |
26 | 0.460859 | nan |
27 | 0.818967 | 1.27983 |
28 | 0.809294 | nan |
29 | 0.664239 | 1.47353 |
import numpy as np
import pandas as pd
df = pd.DataFrame({f"sourceValue{i+1}": np.random.uniform(0, 1, 30) for i in range(7)})
# static version
df["targetValue1"] = np.where(df.index%2, df["sourceValue1"]+df["sourceValue1"].shift(), np.nan)
df["targetValue2"] = np.where(df.index%2, df["sourceValue2"]+df["sourceValue2"].shift(), np.nan)
df["targetValue3"] = np.where(df.index%2, df["sourceValue3"]+df["sourceValue3"].shift(), np.nan)
df["targetValue4"] = np.where(df.index%2, df["sourceValue4"]+df["sourceValue4"].shift(), np.nan)
df["targetValue5"] = np.where(df.index%2, df["sourceValue5"]+df["sourceValue5"].shift(), np.nan)
df["targetValue6"] = np.where(df.index%2, df["sourceValue6"]+df["sourceValue6"].shift(), np.nan)
df["targetValue7"] = np.where(df.index%2, df["sourceValue7"]+df["sourceValue7"].shift(), np.nan)
# dynamic version
df = df.assign(**{c.replace("source", "target"):
np.where(df.index % 2, df[c] + df[c].shift(), np.nan)
for c in df.columns if "sourceValue" in c})
df