Search code examples
regexpcre

Regex to compare natural numbers


I'm learning PCRE2 regular expressions. How can they be used to compare two natural numbers of arbitrary length?

I read on Stackoverflow that for practical purposes this is inefficient. But I'm asking for educational purposes.

Need to compare two numbers given as <number1> ? <number2>. And replace ? with the comparison result: ">", "<" or "=".

Input example:

123 ? 456
90 ? 530
87234790 ? 1
15 ? 15

Expected output:

123 < 456
90 < 530
87234790 > 1
15 = 15

I've tried, but don't quite understand how to do it.

I'm new to Stackoverflow. Sorry if such educational questions are not welcome here.


Solution

  • I have sketched an approximate regex.

    Regex:

    /^(*:>)(?=(?(?=(\d(?:\ \?\ |(?1))\d))
       (?(?=\1.)(*ACCEPT:<))
       (?:(?=\d(\d*+\D++)(((?(3)\3))\d))
          (?>(?=(\d)\2\4\5) |
          (?:1\2\4[0]|2\2\4[01]|3\2\4[0-2]|4\2\4[0-3]|5\2\4[0-4]|6\2\4[0-5]|7\2\4[0-6]|8\2\4[0-7]|9)(*ACCEPT) |
          (*ACCEPT:<))
       .)++(*ACCEPT:=)
    )).*?\K\?/gmx
    

    Replacer: $*MARK

    Demo

    Some explanation

    General strategy: we should compare digits of number1 with corresponding digits of number2.

    (?(?=\1.)(*ACCEPT:<)) - number1 have less digits than number2, so number1 < number2. We don't consider numbers starting with series of "0"

    (?:(?=\d(\d*+\D++)(((?(3)\3))\d)) - start loop through the corresponding digits of given numbers

    (?=(\d)\2\4\5) - if digit1 is equal to digit2 then go to next loop iteration to check next pair of digits

    (?:1\2\4[0]|2\2\4[01]|3\2\4[0-2]|4\2\4[0-3]|5\2\4[0-4]|6\2\4[0-5]|7\2\4[0-6]|8\2\4[0-7]|9)(*ACCEPT) - digit1 > digit2, so number1 > number2, exit

    (*ACCEPT:<)) - digit1 < digit2, so number1 < number2, exit

    .)++(*ACCEPT:=) - all corresponding digits are equal, so number1 = number2