
Extracting n Number of Words from Pandas Series


I have a pandas DataFrame with about 2000 rows that I need to clean up, and I'm stuck on one column. Each value in the column is a string of comma-separated words. I need to trim each string so that only the first 10 words are kept and the rest are discarded.

What I have tried so far is to turn the series into a list, split each entry on the comma separator, extract the first ten items, and then join them back together with commas, all with a bunch of for loops. My code is unreliable and even this much is giving me trouble, although I have done something similar before. I was hoping for a more elegant solution using a lambda function or a list comprehension.


Solution

  • You could try something like this, where 'a' would be the name of your column. It splits each string on commas, keeps the first 10 items, and joins them back together:

    df['a'] = df['a'].apply(lambda x: ",".join(x.split(",")[:10]))
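    If the column might contain missing values, the lambda above would raise an error on them, because NaN has no split method. A possible alternative is to chain pandas' vectorized string methods, which leave missing values as they are. This is only a sketch: the helper name keep_first_words and the sample data are made up for illustration, and 'a' again stands in for your real column name.

```python
import pandas as pd


def keep_first_words(column: pd.Series, n: int = 10) -> pd.Series:
    """Keep only the first n comma-separated words of each string."""
    # .str.split gives a list per row, .str[:n] slices each list,
    # and .str.join glues the kept words back together with commas.
    return column.str.split(",").str[:n].str.join(",")


# Hypothetical sample data: one long row, one short row, one missing value.
df = pd.DataFrame(
    {"a": ["w1,w2,w3,w4,w5,w6,w7,w8,w9,w10,w11,w12", "x,y", None]}
)

df["a"] = keep_first_words(df["a"])
print(df["a"].tolist())
```

    Rows with fewer than 10 words come back unchanged, since slicing past the end of a list is harmless. Note that splitting on "," keeps any spaces after the commas, so if your data looks like "a, b, c" the spaces are preserved in the result.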