I have this data in a data frame
data = [
{'name' : 'a', 'date' : '2020-01-02', 'message' : 'there'},
{'name' : 'b', 'date' : '2020-01-01', 'message' : 'Hello'},
{'name' : 'a', 'date' : '2020-01-01', 'message' : 'Hi'},
{'name' : 'b', 'date' : '2020-01-03', 'message' : 'everyone'},
{'name' : 'c', 'date' : '2020-01-05', 'message' : 'Test'}
]
What I would like to do is group by name, sort each group by date, and concatenate the messages for each name so that the data looks like this:
[
{'name' : 'a', 'message' : 'Hi there'},
{'name' : 'b', 'message' : 'Hello everyone'},
{'name' : 'c', 'message' : 'Test'}
]
I have already been able to group by name and sort by date (after making the string into a datetime object) using this
df.groupby(['name']).apply(lambda x: x.sort_values(['date']))
but I am not sure how you would concatenate the strings together once you have grouped and sorted the data.
Try apply with join: sort by date first, then group by name and join each group's messages.
df.sort_values('date').groupby('name')['message'].apply(' '.join).reset_index()
  name         message
0    a        Hi there
1    b  Hello everyone
2    c            Test
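For completeness, here is a minimal end-to-end sketch of the same approach, starting from the list in the question. The variable names `result` and `records` are made up for illustration, and it assumes the dates are converted with `pd.to_datetime` as the question describes.

```python
import pandas as pd

data = [
    {'name': 'a', 'date': '2020-01-02', 'message': 'there'},
    {'name': 'b', 'date': '2020-01-01', 'message': 'Hello'},
    {'name': 'a', 'date': '2020-01-01', 'message': 'Hi'},
    {'name': 'b', 'date': '2020-01-03', 'message': 'everyone'},
    {'name': 'c', 'date': '2020-01-05', 'message': 'Test'},
]

df = pd.DataFrame(data)
# Convert the date strings to datetimes so the sort is chronological.
df['date'] = pd.to_datetime(df['date'])

result = (
    df.sort_values('date')        # order all rows by date first
      .groupby('name')['message'] # groupby keeps that row order within each group
      .apply(' '.join)            # join each group's messages with a space
      .reset_index()              # turn the 'name' index back into a column
)

# Convert back to a list of dicts, matching the shape asked for in the question.
records = result.to_dict('records')
print(records)
```

Sorting once up front works because groupby preserves the order of rows within each group, so there is no need to sort inside a per-group apply.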