Search code examples
pythonpandasdataframeconditional-formattingmerging-data

Merging/Concat rows in Pandas, based on some specific conditions


I have checked all similar questions, but haven't found a solution to what i look for (or at least i don't get it).

The following is a part of a big DataFrame.

date venue result gf ga team opponent
2017-08-11 Home W 4 1 Arsenal Burnley
2017-08-11 Away L 1 4 Burnley Arsenal
2017-08-11 Home D 2 0 Fulham Leeds
2019-08-11 Home D 1 0 Arsenal Burnley

I would like to make it on one row, if opponent == team on the same date.

So the goal is to get the following:

date venue result gf ga team opponent gf_opponent ga_opponent
2017-08-11 Home W 4 1 Arsenal Burnley 1 4
2017-08-11 Home D 2 0 Fulham Leeds
2019-08-11 Home D 1 0 Arsenal Burnley

I want to do that for the whole DataFrame, where opponent == team on the same date.

Thank you in advance.


Solution

  • You need to switch the team and opponent columns during the merge:

    is_home = df["venue"] == "Home"
    home = df[is_home]
    away = df[~is_home]
    
    home.merge(
        away[["date", "opponent", "team", "gf", "ga"]],
        left_on=["date", "team", "opponent"],
        right_on=["date", "opponent", "team"],
        how="outer",
        suffixes=("", "_opponent"),
    ).drop(columns=["opponent_opponent", "team_opponent"])