python regex python-re

Question on regex not performing as expected


I am trying to normalize company-name suffixes to a common form, for example converting Limited and Limiteed both to LTD.

Here is my code:

re.sub(r"\s+?(CORPORATION|CORPORATE|CORPORATIO|CORPORATTION|CORPORATIF|CORPORATI|CORPORA|CORPORATN)", r" CORP", 'ABC CORPORATN')

I'm trying 'ABC CORPORATN' and it's not converting it to CORP. I can't see what the issue is. Any help would be great.

Edit: I have tried the other endings that I included in the regex and they all work except for CORPORATN (the one I mentioned above).


Solution

  • I see that all the patterns begin with "CORPORA", so we can just do:

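    import re
    # Raw string avoids the invalid-escape warning for \w.
    # \w* (rather than \w+) also matches a bare "CORPORA", which is in the original list.
    print(re.sub(r"CORPORA\w*", "CORP", 'ABC CORPORATN'))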
    

    Output:

    ABC CORP
    

    The same works for the possible spellings of "Limited": if they all begin with "Limit", you can do:

    import re
    print(re.sub(r"Limit\w+", "LTD", 'Shoe Shop Limited.'))
    

    Output:

    Shoe Shop LTD.
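
As for why the original pattern seemed to skip CORPORATN: `re` tries the alternatives in an alternation from left to right and takes the first one that matches. CORPORA is listed before CORPORATN, so it matches first and the trailing "TN" is left behind. If you would rather keep an explicit list of spellings than use `\w`, a minimal sketch is below. It assumes the variants from the question; `VARIANTS` and `build_suffix_pattern` are hypothetical names made up for this example, not part of `re`.

```python
import re

# Variant spellings copied from the question's pattern (same order).
VARIANTS = [
    "CORPORATION", "CORPORATE", "CORPORATIO", "CORPORATTION",
    "CORPORATIF", "CORPORATI", "CORPORA", "CORPORATN",
]


def build_suffix_pattern(variants):
    """Compile an alternation that tries longer spellings first.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(variants, key=len, reverse=True)
    alternation = "|".join(re.escape(v) for v in ordered)
    # The trailing \b stops a short variant from matching inside a longer word.
    return re.compile(r"\s+(?:" + alternation + r")\b")


# Original order: CORPORA matches before CORPORATN is tried, so "TN" is left over.
original = re.sub(r"\s+?(" + "|".join(VARIANTS) + ")", " CORP", "ABC CORPORATN")
print(original)  # ABC CORPTN

# Longest-first order plus a word boundary consumes the whole suffix.
fixed = build_suffix_pattern(VARIANTS).sub(" CORP", "ABC CORPORATN")
print(fixed)  # ABC CORP
```

The word boundary alone would already be enough here, because a failed `\b` after CORPORA makes the engine backtrack and try the next alternative. Sorting longest-first just makes the intent obvious and keeps the pattern safe if the `\b` is ever dropped.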