Search code examples
pythonregexpython-re

Regular Expression pattern in


I have a string full of metadata and want to retrieve just some unique codas with a pattern: A letter + 4 numbers i.e: R0001, D0453, L0465

I'm currently querying this with:

re.findall(r'\bD[0-9999]*', test_data6)

And I change the letter for all the alphabet and run the script. Is there a way that it can find all that specific patterns easily?

I tried: re.findall(r'\b[A-Z]+[0-9999]*', test_data6) but doesn't get quite what I need


Solution

  • You can do it the following way:

    import re
    s = "D5678 G562 HHJ2 HZ981112"
    re.findall(r"[A-Z][0-9]{4}", s)  # ['D5678', 'Z9811']
    

    [A-Z] will match any upper case letter and [0-9]{4} will match any string of 4 numbers from 0 to 9 following each other.