Search code examples
regexmathgrepfractions

regex to match irreducible fractions


How can I match irreducible fractions with regex?

For example, 23/25, 3/4, 5/2, 100/101, etc.

First of all, I have no idea about the gcd-algorithm realization in regex.

Update for all of you who's answering like "You are using the wrong tool":

Yeah, guys, I'm realizing what regex is normally used for. It's okay. But that this question is weird is kind of its whole point.

Updated 2: The idea is to find a regex that could be helpful in a situation like:

$> echo "1/2" | grep -P regex
1/2
$> echo "2/4" | grep -P regex

So, the regex should be only a string, without using any scripts and variables. Only regex.

Actually, I already know some regex which match reducible fractions written in the unary number system.

$> echo "11/1111" | grep -P '^1/1+$|(11+)+\1+/\1+$'
11/1111

So the thing is to convert from decimal to unary number system in regex, but I don't know how.


Solution

  • UPDATE

    Since the poster requested a single regex that matches against strings like "36/270", but says it doesn’t matter how legible it is, that regex is:

    my $reducible_rx = qr{^(\d+)/(\d+)$(?(?{(1x$1."/".1x$2)=~m{^(?|1+/(1)|(11+)\1*/\1+)$}})|^)};
    

    But, if like me, you believe that an illegible regex is absolutely unacceptable, you will write that more legibly as:

    my $reducible_rx = qr{
      # first match a fraction:
        ^ ( \d+ ) / ( \d+ ) $
      # now for the hard part:
        (?(?{ ( 1 x $1 . "/" . 1 x $2 ) =~ m{
                    ^
                    (?|    1+      / (1)  # trivial case: GCD=1
                      |  (11+) \1* / \1+  # find the GCD
                    )
                     $
                }x
            })
              # more portable version of (*PASS)
         | ^  # more portable version of (*FAIL)
         )
    }x;
    

    You can improve maintainability by splitting out the version that matches the unary version from the one that matches the decimal version like this:

    # this one assumes unary notation
    my $unary_rx = qr{
        ^ 
        (?|   1+       / (1)
          | (11+)  \1* / \1+ 
        ) 
        $
    }x;
    
    # this one assumes decimal notation and converts internally
    my $decimal_rx = qr{
      # first match a fraction:
        ^ ( \d+ ) / ( \d+ ) $ 
      # now for the hard part:
        (?(?{( 1 x $1 . "/" . 1 x $2 ) =~ $unary_rx})
              # more portable version of (*PASS)
         | ^  # more portable version of (*FAIL) 
         )
    }x;
    

    Isn’t that much easier by separating it into two named regexes? That would now make $reducible_rx the same as $decimal_rx, but the unary version is its own thing. That’s how I would do it, but the original poster wanted a single regex, so you’d have to interpolate the nested one for that as I first present above.

    Either way, you can plug into the test harness below using:

        if ($frac =~ $reducible_rx) {
            cmp_ok($frac, "ne", reduce($i, $j), "$i/$j is $test");
        } else {
            cmp_ok($frac, "eq", reduce($i, $j), "$i/$j is $test");
        }
    

    And you will see that it is a correct regex that passes all tests, and does so moreover using a single regex, wherefore having now passed all requirements of the original question, I declare Qᴜᴏᴅ ᴇʀᴀᴛ ᴅᴇᴍᴏɴsᴛʀᴀɴᴅᴜᴍ: “Quit, enough done.” 😇

    And you’re welcome.


    The answer is to match the regex ^(?|1+/(1)|(11+)\1*/\1+)$ against the fraction once it has been converted from decimal to unary notation, at which point the greatest common factor will be found in $1 on a match; otherwise they are coprimes. If you are using Perl 5.14 or better, you can even do this in one step:

    use 5.014;
    my $reg  = qr{^(?|1+/(1)|(11+)\1*/\1+)$};
    my $frac = "36/270";  # for example
    if ($frac =~ s/(\d+)/1 x $1/reg =~ /$reg/) { 
        say "$frac can be reduced by ", length $1;
    } else {
        say "$frac is irreducible";
    }
    

    Which will correctly report that:

    36/270 can be reduced by 18
    

    (And of course, reducing by 1 means there is no longer a denominator.)

    If you wanted to have a bit of punning fun with your readers, you could even do it this way:

    use 5.014;
    my $regex = qr{^(?|1+/(1)|(11+)\1*/\1+)$};
    my $frac  = "36/270";  # for example
    if ($frac =~ s/(\d+)/"1 x $1"/regex =~ /$regex/) {
        say "$frac can be reduced by ", length $1;
    } else {
        say "$frac is irreducible";
    }
    

    Here is the code that demonstrates how to do this. Furthermore, it constructs a test suite that tests its algorithm using all (positive) numerators and denominators up to its argument, or 30 by default. To run it under a test harness, put it in a file named coprimes and do this:

    $ perl -MTest::Harness -e 'runtests("coprimes")'
    coprimes .. ok       
    All tests successful.
    Files=1, Tests=900,  1 wallclock secs ( 0.13 usr  0.02 sys +  0.33 cusr  0.02 csys =  0.50 CPU)
    Result: PASS
    

    Here is an example of its output when run without the test harness:

    $ perl coprimes 10
    1..100
    ok 1 - 1/1 is 1
    ok 2 - 1/2 is 1/2
    ok 3 - 1/3 is 1/3
    ok 4 - 1/4 is 1/4
    ok 5 - 1/5 is 1/5
    ok 6 - 1/6 is 1/6
    ok 7 - 1/7 is 1/7
    ok 8 - 1/8 is 1/8
    ok 9 - 1/9 is 1/9
    ok 10 - 1/10 is 1/10
    ok 11 - 2/1 is 2
    ok 12 - 2/2 is 1
    ok 13 - 2/3 is 2/3
    ok 14 - 2/4 is 1/2
    ok 15 - 2/5 is 2/5
    ok 16 - 2/6 is 1/3
    ok 17 - 2/7 is 2/7
    ok 18 - 2/8 is 1/4
    ok 19 - 2/9 is 2/9
    ok 20 - 2/10 is 1/5
    ok 21 - 3/1 is 3
    ok 22 - 3/2 is 3/2
    ok 23 - 3/3 is 1
    ok 24 - 3/4 is 3/4
    ok 25 - 3/5 is 3/5
    ok 26 - 3/6 is 1/2
    ok 27 - 3/7 is 3/7
    ok 28 - 3/8 is 3/8
    ok 29 - 3/9 is 1/3
    ok 30 - 3/10 is 3/10
    ok 31 - 4/1 is 4
    ok 32 - 4/2 is 2
    ok 33 - 4/3 is 4/3
    ok 34 - 4/4 is 1
    ok 35 - 4/5 is 4/5
    ok 36 - 4/6 is 2/3
    ok 37 - 4/7 is 4/7
    ok 38 - 4/8 is 1/2
    ok 39 - 4/9 is 4/9
    ok 40 - 4/10 is 2/5
    ok 41 - 5/1 is 5
    ok 42 - 5/2 is 5/2
    ok 43 - 5/3 is 5/3
    ok 44 - 5/4 is 5/4
    ok 45 - 5/5 is 1
    ok 46 - 5/6 is 5/6
    ok 47 - 5/7 is 5/7
    ok 48 - 5/8 is 5/8
    ok 49 - 5/9 is 5/9
    ok 50 - 5/10 is 1/2
    ok 51 - 6/1 is 6
    ok 52 - 6/2 is 3
    ok 53 - 6/3 is 2
    ok 54 - 6/4 is 3/2
    ok 55 - 6/5 is 6/5
    ok 56 - 6/6 is 1
    ok 57 - 6/7 is 6/7
    ok 58 - 6/8 is 3/4
    ok 59 - 6/9 is 2/3
    ok 60 - 6/10 is 3/5
    ok 61 - 7/1 is 7
    ok 62 - 7/2 is 7/2
    ok 63 - 7/3 is 7/3
    ok 64 - 7/4 is 7/4
    ok 65 - 7/5 is 7/5
    ok 66 - 7/6 is 7/6
    ok 67 - 7/7 is 1
    ok 68 - 7/8 is 7/8
    ok 69 - 7/9 is 7/9
    ok 70 - 7/10 is 7/10
    ok 71 - 8/1 is 8
    ok 72 - 8/2 is 4
    ok 73 - 8/3 is 8/3
    ok 74 - 8/4 is 2
    ok 75 - 8/5 is 8/5
    ok 76 - 8/6 is 4/3
    ok 77 - 8/7 is 8/7
    ok 78 - 8/8 is 1
    ok 79 - 8/9 is 8/9
    ok 80 - 8/10 is 4/5
    ok 81 - 9/1 is 9
    ok 82 - 9/2 is 9/2
    ok 83 - 9/3 is 3
    ok 84 - 9/4 is 9/4
    ok 85 - 9/5 is 9/5
    ok 86 - 9/6 is 3/2
    ok 87 - 9/7 is 9/7
    ok 88 - 9/8 is 9/8
    ok 89 - 9/9 is 1
    ok 90 - 9/10 is 9/10
    ok 91 - 10/1 is 10
    ok 92 - 10/2 is 5
    ok 93 - 10/3 is 10/3
    ok 94 - 10/4 is 5/2
    ok 95 - 10/5 is 2
    ok 96 - 10/6 is 5/3
    ok 97 - 10/7 is 10/7
    ok 98 - 10/8 is 5/4
    ok 99 - 10/9 is 10/9
    ok 100 - 10/10 is 1
    

    And here is the program:

    #!/usr/bin/env perl
    #
    # coprimes - test suite to use unary coprimality algorithm
    # 
    # Tom Christiansen <tchrist@perl.com>
    # Sun Apr 17 12:18:19 MDT 2011
    
    use strict;
    use warnings;
    
    my $DEFAULT = 2*3*5;
    my $max = @ARGV ? shift : $DEFAULT;
    
    use Test::More;
    plan tests => $max ** 2;
    
    my $rx = qr{
        ^
        (?|   1+       / (1)
          | (11+)  \1* / \1+
        )
        $
    }x;
    
    for my $i ( 1 .. $max ) {
        for my $j ( 1 .. $max ) {
            my $test;
            if (((1 x $i) . "/" . (1 x $j)) =~ /$rx/) {
                my $cf = length($1);
                $test = $i / $cf;
                $test .= "/" . $j/$cf unless $j/$cf == 1;
            } else {
                $test = "$i/$j";
            }
            cmp_ok($test, "eq", reduce($i, $j), "$i/$j is $test");
        }
    }
    
    sub reduce {
        my ($a, $b) = @_;
        use Math::BigRat;
        my $f = new Math::BigRat "$a/$b";
        return "$f";
    }