
Python re: How to match any number, including those with commas and decimals?


I want to match any number, whether it has a decimal point, commas, or is simply a whole number. I tried the pattern below, but my regular expression fails to match the whole number when it contains more than two commas.

Thank you

import re

string1 = "6,111,123,999 5,450,900 10.32 OCT21  Dec 31, 2019"

num = re.findall(r'\b\d+[.,]*\d+[,]*\d*\b', string1)

Result:

['6,111,123', '999', '5,450,900', '10.32', '31', '2019']

Desired Outcome --> ['6,111,123,999', '5,450,900', '10.32', '31', '2019']


Solution
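
A note on why the attempt in the question stops early: the pattern spells out its digit runs one by one (\d+, \d+, \d*), so it can span at most three digit groups, and a fourth group such as 999 is left over for the next match. A minimal sketch of that diagnosis, assuming the last term of the question's pattern is \d*. The `grouped` pattern is an illustrative variant, not one of the patterns proposed in this answer:

```python
import re

string1 = "6,111,123,999 5,450,900 10.32 OCT21  Dec 31, 2019"

# The question's pattern, assuming its last term is \d*:
# at most three digit runs, so a fourth group is split off.
capped = re.findall(r'\b\d+[.,]*\d+[,]*\d*\b', string1)
print(capped)
# ['6,111,123', '999', '5,450,900', '10.32', '31', '2019']

# Illustrative variant: repeat "separator + digits" as a group,
# so any number of comma or decimal groups is allowed.
grouped = re.findall(r'\b\d+(?:[.,]\d+)*\b', string1)
print(grouped)
# ['6,111,123,999', '5,450,900', '10.32', '31', '2019']
```

The grouped variant requires digits on both sides of every separator, so it would treat something like 1,,2 as two numbers rather than one.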

  • matching all numbers

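    You could use \d(?:[\d,.]*\d+)? (a digit, optionally followed by a run of digits, commas, and periods that ends in a digit). This also matches digits embedded in words, such as the 21 in OCT21: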

    import re

    string1 = "6,111,123,999 5,450,900 10.32 OCT21  Dec 31, 2019 1"
    re.findall(r'\d(?:[\d,.]*\d+)?', string1)
    

    output: ['6,111,123,999', '5,450,900', '10.32', '21', '31', '2019', '1']

  • matching only numbers that are independent words

    Use \b[\d,.]*\d+\b:

    import re

    string1 = "6,111,123,999 5,450,900 10.32 OCT21  Dec 31, 2019 1"
    re.findall(r'\b[\d,.]*\d+\b', string1)
    

    output: ['6,111,123,999', '5,450,900', '10.32', '31', '2019', '1']
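
One caveat with \b: it treats any non-word character as a boundary, so a hyphen or similar punctuation also splits numbers. A quick check on the sample string extended with 1a2 and 1-2 (the name `string2` is only for illustration) shows the behavior:

```python
import re

string2 = "6,111,123,999 5,450,900 10.32 1a2 1-2 OCT21  Dec 31, 2019 1"

# 1a2 is skipped (no boundary between a digit and a letter),
# but 1-2 yields two matches because "-" is a non-word character.
word_bounded = re.findall(r'\b[\d,.]*\d+\b', string2)
print(word_bounded)
# ['6,111,123,999', '5,450,900', '10.32', '1', '2', '31', '2019', '1']
```

If 1-2 should not count as two numbers, the separators need to be stated explicitly with lookarounds instead of \b.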

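  • edit: matching only numbers preceded by whitespace or the start of the string, and followed by whitespace, a comma, or the end of the string

    Use (?:(?<=^)|(?<=\s))[\d,.]*\d+(?=$|\s|,):

    import re

    string1 = "6,111,123,999 5,450,900 10.32 1a2 1-2 OCT21  Dec 31, 2019 1"
    re.findall(r'(?:(?<=^)|(?<=\s))[\d,.]*\d+(?=$|\s|,)', string1)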
    

    output: ['6,111,123,999', '5,450,900', '10.32', '31', '2019', '1']
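
The two lookbehinds are presumably written separately because Python's re module only accepts fixed-width lookbehind, and ^ and \s have different widths. As a possible simplification, a single negative lookbehind (?<!\S), meaning "not preceded by a non-whitespace character", covers both cases. A minimal sketch, where the names `explicit` and `compact` are only for illustration:

```python
import re

string1 = "6,111,123,999 5,450,900 10.32 1a2 1-2 OCT21  Dec 31, 2019 1"

# Pattern from the answer: start of string or whitespace before the number.
explicit = re.findall(r'(?:(?<=^)|(?<=\s))[\d,.]*\d+(?=$|\s|,)', string1)

# Same idea with a single negative lookbehind.
compact = re.findall(r'(?<!\S)[\d,.]*\d+(?=$|\s|,)', string1)

print(compact)
# ['6,111,123,999', '5,450,900', '10.32', '31', '2019', '1']
print(explicit == compact)
# True
```

Both forms still accept a leading separator after whitespace (for example " ,5"), so a stricter pattern would be needed if such input can occur.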