python pandas truncate maxlength

Python Pandas Truncate a column to a specific length without cutting the last word


I am trying to truncate the values in a column to 50 characters, and I use this lambda function:

df['col_1'] = df['col_1'].apply(lambda x: x[:50])

It works fine, except that it cuts the last word in half. I need a solution that removes that partial last word, even if the result ends up a few characters shorter than 50.

Thank you for any advice on this


Solution

  • Truncate to at most 50 characters, then drop the last (possibly partial) word:

    df['col_1'] = df['col_1'].apply(lambda x: ' '.join(x[:50].split(' ')[:-1]) if len(x) > 50 else x)
    

    Note that doing it the other way around (first dropping the last word and only then truncating) can leave a half-word at the end of the sentence.

    How does the lambda expression work?

    1. It is given x, the current sentence to work on.
    2. It checks whether the sentence is longer than 50 chars.
      2.1. If it is, it first truncates to 50 chars and then removes the last word, which may be a fragment left by the cut.
      2.2. Otherwise, the sentence is 50 chars or shorter and remains intact.
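
The one-liner above has a few edge cases worth knowing about. If the cut at 50 chars happens to land exactly at the end of a word, that complete word is still dropped. If the first 50 chars contain no space at all, the result is an empty string. And non-string values such as NaN raise a TypeError. The following is a minimal sketch of a more defensive variant, not part of the original answer. The helper name truncate_at_word and its max_len parameter are hypothetical, and falling back to a hard cut for a single over-long word is an assumption you may want to change.

```python
def truncate_at_word(text, max_len=50):
    """Truncate text to at most max_len characters without ending mid-word.

    Hypothetical helper, written as an illustration only.
    """
    # Leave missing values (NaN, None) and already-short strings untouched.
    if not isinstance(text, str) or len(text) <= max_len:
        return text

    cut = text[:max_len]

    # If the next character is a space, the cut landed exactly on a word
    # boundary, so the last word is complete and can be kept.
    if text[max_len] == ' ':
        return cut.rstrip()

    # Otherwise the last word is a fragment: drop everything after the
    # last space inside the truncated string.
    head, sep, _ = cut.rpartition(' ')

    # Assumption: if there is no space at all (one very long word),
    # fall back to the hard cut instead of returning an empty string.
    return head.rstrip() if sep else cut
```

Assuming the same DataFrame as in the question, it would be applied with df['col_1'] = df['col_1'].apply(truncate_at_word). Because the helper returns non-string values unchanged, NaN entries in the column pass through as they are.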