
PHP :: Parse strings, while iterating through an array of substrings?


I'm a Java developer who is struggling to write his first PHP script. FYI, I'm coding with PHP 8.1.2 on an Ubuntu machine.

My code has to open a log file, read the lines one-by-one, then extract a key substring based on the preamble of the string. For example, if the log file is:

April 01 2020 Key Information Read :: Interesting Character #1:  Kermit the Frog
April 01 2020 Key Information Read :: Interesting Character #2:  Miss Piggy
April 01 2020 Key Information Read :: Their Best Movie:  The Muppet Movie (1979)
...many more lines...

Then I need a script that reads each line and extracts:

Kermit the Frog
Miss Piggy
The Muppet Movie (1979)
...many more items...

I won't know those above values before the file is read.

This problem is very solvable. Here's my PHP code, where $str is one line of the input file:

    function parseThisStr($str){
            if( str_contains($str, "Interesting Character #1:  ") ){
                    $mySubstr = "Interesting Character #1:  ";
                    $tmpIndex = strpos( $str, $mySubstr );
                    $tmpIndex += strlen($mySubstr);
                    $str2 = substr( $str, $tmpIndex );
                    $str2 = preg_replace('~[\r\n]+~', '', $str2);   // remove newline
                    return $str2;
            }
            else if( str_contains($str, "Interesting Character #2:  ") ){
                    $mySubstr = "Interesting Character #2:  ";
                    // ...copy code from above...
                    return $str2;
            }
            else if( str_contains($str, "Their Best Movie:  ") ){
                    $mySubstr = "Their Best Movie:  ";
                    // ...copy code from above...
                    return $str2;
            }
            return $str;
    }

This will work... but it's needlessly repetitive, right? For each substring I am checking, I need to copy five identical lines of code. There are about 30 substrings I need to search for; this will make my code about 150 lines longer than it needs to be.

There's got to be a way to do this with more intelligence, right? Can't I store every to-be-searched substring in an array, maybe like this:

$array = array(
    1    => "Interesting Character #1:  ",
    2    => "Interesting Character #2:  ",
    3    => "Their Best Movie:  ",
    ...etc...
);

...and then iterate through the array, maybe like this:

    function parseThisStr($str){
            $array = array(
                  1    => "Kermit the Frog",
                  ...etc...
            };
            foreach( $array as &$value ){
                if( str_contains($str, $value) ){
                        $tmpIndex = strpos( $str, $value );
                        $tmpIndex += strlen($value);
                        $str2 = substr( $str, $tmpIndex );
                        $str2 = preg_replace('~[\r\n]+~', '', $str2);   // remove newline
                        return $str2;
                }
            return null;
            }

Conceptually, this should work... but I can't figure out the correct syntax. PHP syntax is confusing to me, sadly. Does anyone see where I'm going wrong? Thank you.

EDIT: I screwed up the values of $array in my first posting. $array should have the substrings that I'll use to search the larger string.


Solution

  • Using a regex will produce clearer code. For example, with preg_match:

    $line = 'April 01 2020 Key Information Read :: Interesting Character #1:  Kermit the Frog';
    $searchTerms = ["Kermit the Frog","Miss Piggy","The Muppet Movie (1979)"];
    
    // prepare regex with named group from terms
    $delimiter = '~';
    $regex = $delimiter . '(?<phrase>(' . join('|', array_map(fn($term) => preg_quote($term, $delimiter), $searchTerms)) . '))' . $delimiter;
    
    // search by regex
    preg_match($regex, $line, $matches);
    $foundPhrase = $matches['phrase'] ?? null;
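
The question's EDIT says the array should hold the preambles rather than the values, so a loop over those preambles may be closer to what was asked. Below is a minimal sketch of that idea, assuming each line contains at most one of the known preambles. The names extractAfterPrefix() and $prefixes are hypothetical, chosen only for illustration. It also swaps the question's preg_replace for rtrim, which only strips trailing newlines and spaces.

```php
<?php
declare(strict_types=1);

/**
 * Return the text that follows the first matching prefix in $line,
 * or null when none of the prefixes occurs in the line.
 * (extractAfterPrefix and $prefixes are illustrative names, not from the question.)
 */
function extractAfterPrefix(string $line, array $prefixes): ?string
{
    foreach ($prefixes as $prefix) {
        $pos = strpos($line, $prefix);
        if ($pos === false) {
            continue; // this prefix is not in the line, try the next one
        }
        // Skip past the prefix, then drop trailing CR/LF and spaces.
        return rtrim(substr($line, $pos + strlen($prefix)), "\r\n ");
    }
    return null;
}

// The preambles from the question; the real list would have about 30 entries.
$prefixes = [
    "Interesting Character #1:  ",
    "Interesting Character #2:  ",
    "Their Best Movie:  ",
];

$line = "April 01 2020 Key Information Read :: Their Best Movie:  The Muppet Movie (1979)\n";
echo extractAfterPrefix($line, $prefixes) ?? 'no match', PHP_EOL; // prints: The Muppet Movie (1979)
```

Two details in the question's loop attempt are worth comparing against this sketch: the array literal there is closed with `};` instead of `);`, and the foreach body is missing its closing brace. The `&` in `&$value` is also unnecessary, since the loop never modifies the array.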