I have two object series, one before and one after, grouped by name.
I generated them from the following code:
beforeseries = dfbefore.groupby('name', dropna=True)['order'].apply(list)
print(beforeseries)
afterseries = dfafter.groupby('name', dropna=True)['order'].apply(list)
print(afterseries)
Which generate the following outputs:
Beforeseries:
Name1 [first, second, third] Name2 [first, second, third] Name_n [first, second, third, fourth]
Afterseries:
Name1 [fourth, fifth] Name2 [fourth, fifth, sixth] Name_n [fifth, sixth]
I want to join them so the output looks like the following”
Name 1 [‘first second third’, ‘fourth fifth’] Name 2 [‘first second third’, ‘fourth fifth sixth’] Name 3 [‘first second third fourth’, ‘fifth sixth’]
IIUC, seems like you need to concat
both objects :
out = (
pd.concat([beforeseries, afterseries], axis=1)
.groupby(level=0, axis=1).agg(lambda x: x.to_numpy().tolist())
# .squeeze() # uncomment this chain if you need a Series
)
Another variant :
# to turn off the FutureWarning in `2.1.0`:
# DataFrame.groupby with axis=1 is deprecated.
# Do `frame.T.groupby(...)` without axis instead.
out = (
pd.concat([beforeseries, afterseries], axis=1)
.T.groupby(level=0).agg(list).T
)
Alternatively, without creating the intermediate Series :
keys = ["before", "after"]
out = (
pd.concat([dfbefore, dfafter], keys=keys, names=["when", None])
.groupby(by=["name", "when"], sort=False)["order"].agg(" ".join)
.unstack().agg(list, axis=1)
# .to_frame("order")
)
Output :
print(out)
order
name
A [v x, o q]
B [y, r]
C [w z, p s]
Used inputs :
dfbefore = pd.DataFrame({"name": list("ACABC"), "order": list("vwxyz")})
beforeseries = dfbefore.groupby("name", dropna=True)["order"].agg(" ".join)
dfafter = pd.DataFrame({"name": list("ACABC"), "order": list("opqrs")})
afterseries = dfafter.groupby("name", dropna=True)["order"].agg(" ".join)