Tags: python, regex, string, substring, case-sensitive

Remove specific letter and adjacent string values regex


I have a list of strings, each made up of space-separated tokens. I want to find every token that contains a lowercase "n" that is not next to an uppercase letter, then remove that "n" together with any adjacent letters/numbers (in other words, drop the whole token). Example:

before = ["23n 5T R3",
"4T 3R 2+ 2-",
"-2 +3RF n3",
"Nn1 L9 3+ n",
"un2 L0 -9 e"]

I am trying to get an output as:

after = ["5T R3",
"4T 3R 2+ 2-",
"-2 +3RF",
"Nn1 L9 3+",
"L0 -9 e"]

I am not sure how to start writing this regex. Apologies if it's a bit of a tough one.


Solution

  • You can use a negative lookbehind to achieve this.

    Demo: https://regex101.com/r/HKi4tp/2

    Pattern: \b\S*(?<![A-Z])n\S*\b ?

    Breakdown:

    • \b\S* and \S*\b: match any number of non-space characters at the start and end of the word. (Note: depending on your needs, \S can be replaced with \w or [a-zA-Z0-9])
    • (?<![A-Z])n: match an n that is not immediately preceded by an uppercase letter [A-Z]
    • " ?" (a space followed by ?): match an optional space after the word
    • In the substitution, use an empty string as the replacement to delete the match
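
Here is a minimal Python sketch of the pattern above applied to the list from the question. The helper name remove_n_tokens is made up for illustration, and the final .strip() is an addition not in the pattern: when the removed token is the last one in a string, the space before it is left behind, so it is trimmed here.

```python
import re

# Pattern from the answer: a whole token containing an "n" that is not
# immediately preceded by an uppercase letter, plus one optional space after it.
PATTERN = re.compile(r"\b\S*(?<![A-Z])n\S*\b ?")


def remove_n_tokens(strings):
    """Delete matching tokens from each string (hypothetical helper name)."""
    # strip() is an assumption on top of the answer: it removes the space
    # left over when the deleted token was the last one in the string.
    return [PATTERN.sub("", s).strip() for s in strings]


before = ["23n 5T R3",
          "4T 3R 2+ 2-",
          "-2 +3RF n3",
          "Nn1 L9 3+ n",
          "un2 L0 -9 e"]

after = remove_n_tokens(before)
print(after)
# ['5T R3', '4T 3R 2+ 2-', '-2 +3RF', 'Nn1 L9 3+', 'L0 -9 e']
```

Note that the lookbehind only checks the character before the "n". If an "n" followed by an uppercase letter should also be kept, one possible extension is to add a negative lookahead, (?![A-Z]), directly after the n; whether that matches the intent of the question is an assumption.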