pandas, dataframe, group-by, aggregate, kaggle

How to combine two DataFrames in one line of code


I'm working on the Kaggle Titanic competition and doing the data analysis with pandas. The data is available here (https://www.kaggle.com/competitions/titanic/data).

I have two DataFrames, each produced by a groupby aggregation, and I want to combine them into one.

train_df[['Parch', 'Survived']].groupby(['Parch'], as_index=False).count().rename(columns={'Survived':"Count"}).reset_index(drop=True)

train_df[['Parch', 'Survived']].groupby(['Parch'], as_index=False).mean().rename(columns={'Survived':"Surviving rate"}).reset_index(drop=True)

My current approach for combining these two results column-wise is shown below, but I find it clumsy. I suspect there is a simpler way to combine the two DataFrames in a single line of code.


tmp_df = train_df[['Parch','Survived']].groupby(['Parch'], as_index=False).agg({"Survived": ["count","mean"]}).reset_index(drop=True)

tmp_df.columns = tmp_df.columns.droplevel()
tmp_df.columns = ["Parch", "Count", "Surviving rate"]

How can I combine these two DataFrames column-wise with one simple line of code?


Solution

  • Your statement can be written more succinctly using named aggregation, where each output column is specified as a (column, function) pair:

    tmp_df = (
        train_df.groupby("Parch")
        .agg(**{"Count": ("Survived", "count"), "Surviving rate": ("Survived", "mean")})
        .reset_index()
    )
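
To try the named-aggregation approach without downloading the Titanic data, here is a minimal, self-contained sketch. The small `train_df` below is made-up toy data (an assumption for illustration, not the real dataset), and `count_df`, `rate_df` and `merged_df` are hypothetical names used only to rebuild the two separate results from the question for comparison.

```python
import pandas as pd

# Toy stand-in for train_df (made-up values, NOT the real Titanic data).
train_df = pd.DataFrame(
    {
        "Parch": [0, 0, 0, 1, 1, 2],
        "Survived": [0, 1, 1, 1, 0, 1],
    }
)

# Named aggregation: each output column maps to (input column, function).
# The dict is unpacked with ** only because "Surviving rate" contains a
# space, so it cannot be written directly as a keyword argument.
tmp_df = (
    train_df.groupby("Parch")
    .agg(**{"Count": ("Survived", "count"), "Surviving rate": ("Survived", "mean")})
    .reset_index()
)

# The two separate results from the question, rebuilt for comparison.
count_df = (
    train_df[["Parch", "Survived"]]
    .groupby("Parch", as_index=False)
    .count()
    .rename(columns={"Survived": "Count"})
)
rate_df = (
    train_df[["Parch", "Survived"]]
    .groupby("Parch", as_index=False)
    .mean()
    .rename(columns={"Survived": "Surviving rate"})
)

# Joining them on Parch should give the same table as tmp_df.
merged_df = count_df.merge(rate_df, on="Parch")

print(tmp_df)
print(merged_df)
```

When every output name is a valid Python identifier (for example `Rate` instead of `Surviving rate`), the dictionary unpacking is unnecessary and the arguments can be passed directly, as in `.agg(Count=("Survived", "count"), Rate=("Survived", "mean"))`.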