
Apply pandas function to column to create multiple new columns?

How to do this in pandas:

I have a function, extract_text_features, that operates on a single text column and returns multiple output columns. Specifically, the function returns 6 values.

The function works; however, there doesn't seem to be any return type (pandas DataFrame, NumPy array, or Python list) that lets the output be assigned correctly to df.ix[:, 10:16].

So I think I need to drop back to iterating with df.iterrows(), as per this?

UPDATE: Iterating with df.iterrows() is at least 20x slower, so I surrendered and split out the function into six distinct .map(lambda ...) calls.

UPDATE 2: This question was asked back around v0.11.0, before the usability of df.apply was improved and before df.assign() was added in v0.16. Hence much of the question and its answers are no longer very relevant.
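
For readers on a newer pandas, the "one .map call per feature" workaround from the first update can be written in a single df.assign() call. This is only a minimal sketch: the original extract_text_features is not shown, so the three features below (n_chars, n_words, n_upper) and the sample data are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical sample data; the real text column is not shown in the question.
df = pd.DataFrame({"textcol": ["Hello world", "pandas", "a b c"]})

# One .map(...) per feature, all attached in one df.assign() call.
# The feature names and definitions are illustrative assumptions.
df = df.assign(
    n_chars=df["textcol"].map(len),
    n_words=df["textcol"].map(lambda s: len(s.split())),
    n_upper=df["textcol"].map(lambda s: sum(ch.isupper() for ch in s)),
)

print(df)
```

Each feature is computed independently here, so the text is scanned once per feature; that is the trade-off of splitting one function into several .map calls.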


  • Building off of user1827356's answer, you can do the assignment in one pass using df.merge:

    df.merge(df.textcol.apply(lambda s: pd.Series({'feature1':s+1, 'feature2':s-1})), 
        left_index=True, right_index=True)
        textcol  feature1  feature2
    0  0.772692  1.772692 -0.227308
    1  0.857210  1.857210 -0.142790
    2  0.065639  1.065639 -0.934361
    3  0.819160  1.819160 -0.180840
    4  0.088212  1.088212 -0.911788

    EDIT: Please be aware of the huge memory consumption and low speed of this approach.
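
Much of that cost likely comes from building a pd.Series for every row. One commonly used alternative, sketched here under assumptions, is to have the function return a plain tuple, build a single DataFrame from the list of tuples, and join it back on the index. The body of extract_text_features, the feature names, and the sample data below are hypothetical.

```python
import pandas as pd

def extract_text_features(s):
    # Hypothetical stand-in for the real function: returns a plain tuple
    # (not a pd.Series), one element per output column.
    return len(s), len(s.split()), s.lower()

# Hypothetical sample data.
df = pd.DataFrame({"textcol": ["Hello world", "pandas", "a b c"]})
names = ["n_chars", "n_words", "lowered"]

# Run the function once per row, then build one DataFrame from all tuples.
features = pd.DataFrame(
    df["textcol"].map(extract_text_features).tolist(),
    index=df.index,   # keep the original index so the join lines up
    columns=names,
)

# Attach the new columns; df itself is left unchanged.
result = df.join(features)
print(result)
```

Passing index=df.index matters when df has a non-default index; without it the join would align on 0..n-1 instead of the original labels.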