Tags: python, regex, python-re

How to make sure the last 2 characters of a string contain at least one digit with a Python regex


I'm trying to extract specific strings from text like the example below. I tried Python's re module with the expression r'[A-Z]{5}[A-Z0-9]{2}', but it also returns unwanted text. The expected output is shown below.

Conditions:

  1. The string should be exactly 7 characters long.
  2. The last 2 characters should either both be digits or contain at least 1 digit alongside a letter from [A-Z].

Given String: "DHKGNC1, DHDHK32, DHKGN1K, SOME, GARBAGE, TEXT"

Expected output: ['DHKGNC1', 'DHDHK32', 'DHKGN1K']

Actual output: ['DHKGNC1', 'DHDHK32', 'DHKGN1K', 'GARBAGE']
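
The actual output above can be reproduced with a short script. This is only a sketch of the attempt described in the question; the variable names text and loose are illustrative and do not come from the original post.

```python
import re

text = "DHKGNC1, DHDHK32, DHKGN1K, SOME, GARBAGE, TEXT"

# [A-Z0-9]{2} lets both trailing characters be letters,
# so a run of seven uppercase letters such as GARBAGE also matches.
loose = re.findall(r'[A-Z]{5}[A-Z0-9]{2}', text)
print(loose)  # ['DHKGNC1', 'DHDHK32', 'DHKGN1K', 'GARBAGE']
```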


Solution

  • Instead of [A-Z0-9]{2}, use ([A-Z0-9][0-9])|([0-9][A-Z0-9]).

    That is, at least one of the last two characters has to be a digit.

    import re
    re.findall(r'[A-Z]{5}(?:[A-Z0-9][0-9]|[0-9][A-Z0-9])', "DHKGNC1, DHDHK32, DHKGN1K, SOME, GARBAGE, TEXT")
    # ['DHKGNC1', 'DHDHK32', 'DHKGN1K']
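
A possible variation, not part of the original answer: if the input may contain longer tokens, a pattern without anchors can match 7 characters inside them, which would conflict with condition 1. The sketch below assumes that tokens are delimited by non-word characters, so \b can be used. It expresses the "at least one digit" rule with a lookahead instead of an alternation. The names PATTERN and find_codes and the extra token XDHKGNC12 are hypothetical and only there for illustration.

```python
import re

# \b ... \b restricts matches to whole tokens (assumes tokens are
# separated by non-word characters such as commas or spaces).
# (?=[A-Z]?[0-9]) requires a digit among the last two characters.
PATTERN = re.compile(r'\b[A-Z]{5}(?=[A-Z]?[0-9])[A-Z0-9]{2}\b')

def find_codes(text):
    """Return all 7-character codes whose last two characters include a digit."""
    return PATTERN.findall(text)

# XDHKGNC12 is an added 9-character token that should be rejected.
sample = "DHKGNC1, DHDHK32, DHKGN1K, SOME, GARBAGE, TEXT, XDHKGNC12"
print(find_codes(sample))  # ['DHKGNC1', 'DHDHK32', 'DHKGN1K']
```

Without the \b anchors, the pattern from the answer would return DHKGNC1 a second time here, taken from inside XDHKGNC12.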