Tags: regex, python-3.x, regex-lookarounds

Splitting digits into groups of threes, from right to left using regular expressions


I have a string '1234567890' that I want to split into groups of three digits, working from right to left, so that the leftmost group has one to three digits (depending on how many are left over).

Essentially, it's the same procedure as adding commas to a long number, except that I also want to extract the last three digits.

I tried using look-arounds but couldn't figure out a way to get the last three digits.

import re

string = '1234567890'
pattern = re.compile(r'\d{1,3}(?=(?:\d{3})+$)')
re.findall(pattern, string)

['1', '234', '567']

Expected output is (I don't need commas):

 ['1', '234', '567', '890']

Solution

  • Adding commas from right to left means inserting a comma after each complete group of three digits, counted from the right. Since regex matching runs left to right, I reverse the string, replace every group of three digits that is followed by at least one more digit with those three digits plus a comma, then reverse the result again and split on the commas.

    import re

    string = '1234567890'
    string = re.sub(r'(?=\d{4})(\d{3})', r'\1,', string[::-1])[::-1]
    print(string.split(','))
    string = '123456789'
    string = re.sub(r'(?=\d{4})(\d{3})', r'\1,', string[::-1])[::-1]
    print(string.split(','))
    

    Output:

    ['1', '234', '567', '890']
    ['123', '456', '789']
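
To make the reverse-and-substitute step easier to follow, here is a small trace of the intermediate values for the first input. The variable names (`reversed_digits`, `with_commas`, `restored`) are only illustrative and do not appear in the original snippet.

```python
import re

digits = '1234567890'

# Step 1: reverse the string so grouping can proceed left to right.
# (Illustrative variable names, not from the original answer.)
reversed_digits = digits[::-1]

# Step 2: append a comma to each group of three digits that is
# followed by at least one more digit.
with_commas = re.sub(r'(?=\d{4})(\d{3})', r'\1,', reversed_digits)

# Step 3: reverse back, then split on the commas.
restored = with_commas[::-1]

print(reversed_digits)      # 0987654321
print(with_commas)          # 098,765,432,1
print(restored)             # 1,234,567,890
print(restored.split(','))  # ['1', '234', '567', '890']
```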
    

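    One part of the regex used for the replacement might warrant further explanation. I added a positive lookahead (?=\d{4}) at the start of the pattern. It asserts that at least four digits remain, so a group of three only gets a comma when another digit follows it. This ensures that we don't add a comma after the final group of three digits in the reversed string (the leading group of the original number) when the length is a multiple of three.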
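
As an alternative sketch that avoids reversing the string, the pattern from the question can probably be adjusted by changing the `+` quantifier inside the lookahead to `*`, so that the last group (followed by zero further groups of three) also matches. The helper name `split_thousands` is hypothetical, and the sketch assumes the input consists only of digits.

```python
import re

# Match 1-3 digits that are followed by zero or more complete
# groups of three digits, up to the end of the string.
GROUPS = re.compile(r'\d{1,3}(?=(?:\d{3})*$)')


def split_thousands(digits):
    """Split a digit string into groups of three, counted from the right.

    Hypothetical helper name; assumes `digits` contains only digits.
    """
    return GROUPS.findall(digits)


print(split_thousands('1234567890'))  # ['1', '234', '567', '890']
print(split_thousands('123456789'))   # ['123', '456', '789']
```

With `+`, the lookahead demands at least one more group of three after the match, which is why the attempt in the question stopped before the last three digits.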
