php regex pcre preg-match-all

Regex to match all lines except duplicate


I have this text:

156.48.459.20 - - [11/Aug/2019
156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [11/Aug/2019

I want to match every line from the current day, so I wrote this simple regex: '/.*11\/Aug\/2019.*/'.

As you can see, two IPs are duplicated in the text, and I don't want to match the duplicate lines. I searched a bit and found this regex: (.).*\1. It looks a bit odd to me, but I tried to apply it to my current regex as (.*11\/Aug\/2019.*)\1, and it did not work. Could someone help?

This is my desired result:

156.48.459.20 - - [11/Aug/2019
235.145.41.12 - - [11/Aug/2019
66.23.114.251 - - [11/Aug/2019

NOTE: I'm using the function preg_match_all():

preg_match_all('/(.*11\/Aug\/2019.*)\1/', $input_lines, $output_array);

Solution

  • Is pure regex a requirement?

    You can use PHP to get uniques:

    <?php
    $input_lines = '156.48.459.20 - - [11/Aug/2019
    156.48.459.20 - - [11/Aug/2019
    235.145.41.12 - - [11/Aug/2019
    235.145.41.12 - - [11/Aug/2019
    66.23.114.251 - - [11/Aug/2019';
    
    preg_match_all( '/.*11\/Aug\/2019/m', $input_lines, $output_array );
    
    // PHP associative array abuse incoming
    // Flip the array so that the values become keys and flip it back
    // This guarantees that only uniques survive
    $output_array[ 0 ] = array_flip( array_flip( $output_array[ 0 ] ) );
    
    var_dump( $output_array );
    

    Outputs:

    array(1) {
      [0]=>
      array(3) {
        [1]=>
        string(30) "156.48.459.20 - - [11/Aug/2019"
        [3]=>
        string(30) "235.145.41.12 - - [11/Aug/2019"
        [4]=>
        string(30) "66.23.114.251 - - [11/Aug/2019"
      }
    }
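
If pure regex does turn out to be a requirement, one possible sketch is to capture each whole line and reject it with a negative lookahead when an identical line appears later in the input. This is not part of the answer above, and it assumes that "duplicate" means an identical full line. The variable names follow the question. Note that it keeps the last occurrence of each repeated line rather than the first, and that the lookahead rescans the rest of the input for every candidate line, so it may be slow on large logs.

```php
<?php
// Sample log lines from the question, joined with "\n".
$input_lines = implode("\n", [
    '156.48.459.20 - - [11/Aug/2019',
    '156.48.459.20 - - [11/Aug/2019',
    '235.145.41.12 - - [11/Aug/2019',
    '235.145.41.12 - - [11/Aug/2019',
    '66.23.114.251 - - [11/Aug/2019',
]);

// ^( ... )$         capture one whole line containing the date
//                   (the m flag makes ^ and $ work per line)
// (?![\s\S]*^\1$)   fail if the identical line occurs again further down
$pattern = '/^(.*11\/Aug\/2019.*)$(?![\s\S]*^\1$)/m';

preg_match_all($pattern, $input_lines, $output_array);

// $output_array[1] holds the captured lines, one per distinct line.
print_r($output_array[1]);
```

The backreference \1 in the question's attempt failed because it required the same text to repeat immediately, on the same line. Moving it into a lookahead that can skip over newlines is what lets it compare against later lines.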