
Removing square brackets when creating a new column in pandas


I have a pandas df, something like this:

     col1              col2             
     ABC       [hello, hi, hey, hiya]

My task is to extract the first three words of col2 into a new column, joined by hyphens. Something like this:

     col1              col2                 col3    
     ABC       [hello, hi, hey, hiya]    hello-hi-hey

This seemed simple enough, but whatever I try, I cannot get rid of the square brackets in the new column. Is this possible to do? Any help would be appreciated.


Solution

  • Assuming a Series of lists, slice and join:

    df['col3'] = df['col2'].str[:3].agg('-'.join)
    

    If you instead have string representations of lists:

    import re
    df['col3'] = ['-'.join(re.split(', ', s[1:-1])[:3]) for s in df['col2']]
    

    output:

      col1                    col2          col3
    0  ABC  [hello, hi, hey, hiya]  hello-hi-hey
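
Which of the two cases applies depends on what the cells of col2 actually contain; checking `type(df['col2'].iloc[0])` should tell you whether it is `list` or `str`. As a rough, self-contained sketch of both cases (the helper names and the two sample frames are made up for illustration, and the string case assumes items are separated by exactly `', '`), the same result can also be written with the `.str` accessor only:

```python
import pandas as pd


def first_three_from_lists(col: pd.Series) -> pd.Series:
    # Hypothetical helper: each cell is a real Python list.
    # Slice the first three items, then join them with hyphens.
    return col.str[:3].str.join('-')


def first_three_from_strings(col: pd.Series) -> pd.Series:
    # Hypothetical helper: each cell is text such as "[hello, hi, hey, hiya]".
    # Strip the outer brackets, split on ", ", keep three items, rejoin.
    return col.str.strip('[]').str.split(', ').str[:3].str.join('-')


# Sample data for each case (assumed, modelled on the question).
df_lists = pd.DataFrame({'col1': ['ABC'],
                         'col2': [['hello', 'hi', 'hey', 'hiya']]})
df_strings = pd.DataFrame({'col1': ['ABC'],
                           'col2': ['[hello, hi, hey, hiya]']})

df_lists['col3'] = first_three_from_lists(df_lists['col2'])
df_strings['col3'] = first_three_from_strings(df_strings['col2'])

print(df_lists)
print(df_strings)
```

Both frames print identically, so the printed output alone cannot tell a list from its string representation; that is why the type check is worth doing first.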