Concat and drop columns function


I normally use plain Python code to concatenate a few columns and drop the rest. But when I try to wrap it in a function (def), it doesn't work, and I'm fairly sure I'm not doing it correctly. Can anyone help with this, please?

def preprocessor_concat():
    df['1'] = df['2'] + df['3'] + df['4']
    df.drop(['2', '3', '4'], axis=1, inplace=True)

Solution

  • You mentioned "concatenation", but your intent seems to be focused on "addition". This function demonstrates both operations.

    import pandas as pd

    df_example = pd.DataFrame([
            {"2":  6,  "3":  7,  "4":  8,  "5":  9},
            {"2": 16,  "3": 17,  "4": 18,  "5": 19},
    ])
    
    def preprocessor_concat(df):
        # Concatenation: stack the three columns end to end (6 rows)
        print(pd.concat([df["2"], df["3"], df["4"]], axis="index"))
        print("\n")
        # Addition: pop() removes each column from df and returns it
        df["1"] = df.pop("2") + df.pop("3") + df.pop("4")
        print(df)
    
    preprocessor_concat(df_example)
    

    Output:

    0     6
    1    16
    0     7
    1    17
    0     8
    1    18
    dtype: int64
    
        5   1
    0   9  21
    1  19  51
    

    Notice that the row-wise concatenation yields 6 rows, while the original dataframe has only 2, so the result cannot simply be assigned back as a column. A new column needs the same number of rows as the dataframe it goes into. Otherwise, do what we see above: keep the concatenated result as a brand new object.
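
If the goal is a reusable function that combines a few columns and drops them, one possible shape is sketched below. This is only an illustration under assumptions: the helper name `combine_and_drop` and its parameters `sources`, `target`, and `as_text` are hypothetical and do not come from the question. It returns a new dataframe instead of using `inplace=True`, and it covers both readings of "concat": numeric addition and string concatenation.

```python
import pandas as pd


def combine_and_drop(df, sources, target, as_text=False):
    """Return a new DataFrame with `sources` combined into `target`.

    Hypothetical helper; the name and parameters are illustrative only.
    """
    # With as_text=True the values are joined as strings (6, 7, 8 -> "678");
    # otherwise they are added element-wise (6 + 7 + 8 -> 21).
    parts = [df[col].astype(str) if as_text else df[col] for col in sources]
    combined = parts[0]
    for part in parts[1:]:
        combined = combined + part

    # drop() without inplace returns a new frame, so the caller's df is untouched
    out = df.drop(columns=sources)
    out[target] = combined
    return out


df_example = pd.DataFrame([
    {"2": 6, "3": 7, "4": 8, "5": 9},
    {"2": 16, "3": 17, "4": 18, "5": 19},
])

added = combine_and_drop(df_example, ["2", "3", "4"], "1")
joined = combine_and_drop(df_example, ["2", "3", "4"], "1", as_text=True)
print(added)
print(joined)
```

Passing the dataframe in as an argument and returning the result avoids depending on a global `df`, which is a common reason code that works at the top level of a script stops working once it is moved into a function.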