python string nlp python-re

Remove the part with a character and numbers connected together in a string


How can I remove the part of a string that consists of "_" followed by digits, using Python?

For example,

Input: ['apple_3428','red_458','D30','green']

Expected output: ['apple','red','D30','green']

Thanks!


Solution

  • I am not sure which behavior is needed, so here are a few options.

    A list comprehension is generally preferable to map + lambda because it is more Pythonic (see "List comprehension vs map").

    1. \d+ matches at least one digit
    2. \d* matches zero or more digits
    3. $ anchors the match at the end of the string
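    >>> import re
    >>> # '_' plus at least one digit, only at the end of the string
    >>> list(map(lambda x: re.sub(r'_\d+$', '', x), ['green_', 'green_458aaa']))
    ['green_', 'green_458aaa']
    >>> # '_' plus zero or more digits, anywhere in the string
    >>> list(map(lambda x: re.sub(r'_\d*', '', x), ['green_', 'green_458aaa']))
    ['green', 'greenaaa']
    >>> # '_' plus at least one digit, anywhere in the string
    >>> list(map(lambda x: re.sub(r'_\d+', '', x), ['green_', 'green_458aaa']))
    ['green_', 'greenaaa']
    >>> # drop everything from the first '_' onward
    >>> list(map(lambda x: x.split('_', 1)[0], ['green_', 'green_458aaa']))
    ['green', 'green']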
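
For the input in the question, a list comprehension version might look like the sketch below. It assumes that only an underscore followed by digits at the very end of each string should be removed; the names `TRAILING_ID` and `strip_numeric_suffix` are illustrative and do not come from the original answer.

```python
import re

# An underscore followed by one or more digits, anchored at the end of the string.
TRAILING_ID = re.compile(r'_\d+$')

def strip_numeric_suffix(items):
    """Return a new list with any trailing '_<digits>' removed from each string."""
    return [TRAILING_ID.sub('', s) for s in items]

words = ['apple_3428', 'red_458', 'D30', 'green']
print(strip_numeric_suffix(words))  # ['apple', 'red', 'D30', 'green']
```

Strings without a matching suffix, such as 'D30' and 'green', pass through unchanged. If the digits may also appear in the middle of a string, dropping the `$` anchor gives the `_\d+` behavior shown earlier.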