Tags: python-3.x, string, find, acronym

How to find acronyms containing numbers in a string


I need to write a function that finds uppercase acronyms, including ones that contain numbers, but so far I can only detect the ones made up of letters alone.

An example:

s = "the EU needs to contribute part of their GDP to improve the IC3 plan"

I tried:

import re

def acronym(s):
    return re.findall(r"\b[A-Z]{2,}\b", s)
print(acronym(s))

but I only get

['EU', 'GDP']

What can I add or change to get

['EU', 'GDP', 'IC3']

Thanks.


Solution

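  • The original pattern misses IC3 because \b cannot match between C and 3, which are both word characters. Adding \d* after the letters allows optional trailing digits, while bare numbers (e.g. 123) still won't match: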

    import re
    
    s = "the EU needs to contribute part of their GDP to improve the IC3 plan"
    
    def acronym(s):
        return re.findall(r"\b([A-Z]{2,}\d*)\b", s)
    
    print(acronym(s))
    

    Prints:

    ['EU', 'GDP', 'IC3']
    

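One limit worth knowing: the pattern above needs at least two letters up front and only allows digits at the end, so it would skip things like B2B or A1. If those should count too, a looser pattern might work. The following is only a sketch: the names `find_acronyms`, `TRAILING_DIGITS`, and `MIXED`, as well as the sample sentence, are made up for illustration, and the `MIXED` pattern is an assumption about what you may want, not part of the answer above.

```python
import re

# Pattern from the answer above (without the capture group):
# two or more capitals, then optional trailing digits.
TRAILING_DIGITS = re.compile(r"\b[A-Z]{2,}\d*\b")

# Hypothetical looser variant (an assumption, not from the answer):
# one capital, then one or more capitals or digits in any order.
MIXED = re.compile(r"\b[A-Z][A-Z0-9]+\b")

def find_acronyms(text, pattern=TRAILING_DIGITS):
    """Return every non-overlapping match of `pattern` in `text`."""
    return pattern.findall(text)

sample = "NASA and IC3 ran a B2B pilot under plan A1 with 123 sites"

print(find_acronyms(sample))         # ['NASA', 'IC3']
print(find_acronyms(sample, MIXED))  # ['NASA', 'IC3', 'B2B', 'A1']
```

Both patterns still ignore bare numbers like 123 and ordinary capitalized words, since the match has to start with a capital letter at a word boundary and continue with only capitals or digits. Whether the looser variant is right depends on whether two-character tokens like A1 count as acronyms in your data.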