
Add the year and replace the 3-letter month with its month number in an array of strings


Background information:

  • The log file is copied and read out at regular intervals.
  • Log file lines do not include a year.
  • Months are continuous.
  • Lines before a January always belong to the previous year.

If no January lines appear in the log file, every line belongs to the current year. Example:

  • 2023 mar,
  • 2023 apr,
  • 2023 may,
  • 2023 jun

If January occurs one or more times in the month cycle, the current year starts at the last January block. Example:

  • 2021 nov,
  • 2021 dec,
  • 2021 dec
  • 2022 jan, // new year
  • 2022 jan,
  • 2022 feb,
  • ...
  • 2022 dec,
  • 2023 jan, // new year; the last Jan starts the current year
  • 2023 jan,
  • 2023 feb,
  • ...
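
The rules above can be sketched as a small two-pass routine: count how often the month number drops (a year boundary), start that many years before the current year, then walk forward. The function name `yearsForMonths` and the explicit `$currentYear` parameter are illustrative assumptions, not part of the original post, and the sketch treats every drop in month number as a new year, which relies on the months being continuous.

```php
<?php
// Hypothetical helper: maps a list of 3-letter month names to years,
// so that the last line always lands in $currentYear.
function yearsForMonths(array $monthNames, int $currentYear): array
{
    $order = array_flip(['jan', 'feb', 'mar', 'apr', 'may', 'jun',
                         'jul', 'aug', 'sep', 'oct', 'nov', 'dec']);
    $nums = array_map(
        fn($name) => $order[strtolower($name)] + 1,
        array_values($monthNames)
    );

    // Pass 1: every drop in month number marks a year boundary.
    $rollovers = 0;
    for ($i = 1; $i < count($nums); ++$i) {
        if ($nums[$i] < $nums[$i - 1]) {
            ++$rollovers;
        }
    }

    // Pass 2: start far enough back and increment at each boundary.
    $year = $currentYear - $rollovers;
    $years = [];
    foreach ($nums as $i => $num) {
        if ($i > 0 && $num < $nums[$i - 1]) {
            ++$year;
        }
        $years[] = $year;
    }
    return $years;
}

print_r(yearsForMonths(['nov', 'dec', 'jan', 'feb'], 2023)); // 2022, 2022, 2023, 2023
```

Passing the year in, rather than calling date('Y') inside the function, keeps the result reproducible regardless of when the script runs.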

Code:

Work out the year of each line from its 3-letter month and its position in the file, then add the correct year to each line.

$arr = [
    // without first Dec, Jan and Jan lines:
    // all subsequent lines are the current year
    "Dec 23 21:37:56 hello",
    "Jan 12 02:08:23 hello",
    "Jan 16 17:34:33 hello",
    "Feb  4 12:21:09 hello",
    "Mar 19 17:07:26 hello",
    "Apr  1 00:00:03 hello",
    "Apr 12 23:07:39 hello",
    "May 21 04:09:34 hello",
    "Jun  7 23:34:56 hello",
    "Jul  1 14:45:34 hello",
    "Aug 13 11:37:23 hello",
    "Sep 29 07:36:03 hello",
    "Oct 30 09:01:00 hello",
    "Nov 10 11:00:03 hello",
    "Dec 25 21:47:51 hello"
];

Create a function to find the years.

function setYear()
{
    global $arr, $y;
    $first = explode(' ', $arr[array_key_first($arr)]);
    
    // if the 1st line doesn't start with Jan, then it's the previous year.
    if (!in_array('01', $first)) {
        $y = date("Y", strtotime("-1 year"));
    } else {
        $y = date("Y");
    }
    return $y;
}

Prepend the year and convert the month name to its number:

$arr = preg_replace_callback(
    '/^(\w+)\s+(\d+)\s/',
    function ($matches) {
        global $y;
        $yy = setYear($y);
        return date($yy . ' m d', strtotime($matches[0] . ' ' . date("Y"))) . ' ';
    },
    $arr
);

echo '<pre>';
print_r($arr);
echo '</pre>';

Unexpected result:

Array
(
    [0] => 2022 12 23 21:37:56 hello
    [1] => 2022 01 12 02:08:23 hello
    [2] => 2022 01 16 17:34:33 hello
    [3] => 2022 02 04 12:21:09 hello
    [4] => 2022 03 19 17:07:26 hello
    // ...
    [9] => 2022 11 10 11:00:03 hello
    [10] => 2022 12 25 21:47:51 hello
)

Expected result:

Array
(
    [0] => 2022 12 23 21:37:56 hello
    [1] => 2023 01 12 02:08:23 hello
    [2] => 2023 01 16 17:34:33 hello
    [3] => 2023 02 04 12:21:09 hello
    [4] => 2023 03 19 17:07:26 hello
    // ...
    [9] => 2023 11 10 11:00:03 hello
    [10] => 2023 12 25 21:47:51 hello
)

Solution

  • Use a static variable instead of bringing global variables into scope with global. A static variable keeps its value between calls of the callback, so state set on an earlier line is still available on later lines. Once Jan is encountered, set a flag to true and leave it set. Until the flag is true, subtract 1 year from the date's year.

    Code: (Demo)

    var_export(
        preg_replace_callback(
            '/^([a-z]{3}) +\d+/i',
            function($m) {
                static $encounteredJan = false;
                $encounteredJan = $encounteredJan || $m[1] === 'Jan';
                return date('Y m d', strtotime($m[0] . ($encounteredJan ? '' : ' -1 year')));
            },
            $arr
        )
    );
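
To see the static flag in isolation, here is a toy callback (the name `$seenJan` is hypothetical, chosen for this illustration) that only reports whether Jan has been seen so far. It is a minimal sketch of the mechanism, not a replacement for the snippet above.

```php
<?php
// Hypothetical toy callback: returns true from the first 'Jan' onward.
$seenJan = function (string $month): bool {
    static $seen = false;              // initialised once, kept between calls
    $seen = $seen || $month === 'Jan'; // once true, it stays true
    return $seen;
};

$flags = array_map($seenJan, ['Nov', 'Dec', 'Jan', 'Feb']);
var_export($flags); // false, false, true, true
```

Because the flag lives with the closure, calling `$seenJan` again after this run keeps returning true, so a fresh closure is needed for each independent dataset.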
    

    If you cannot rely on Jan existing in the dataset, then (assuming you never need to jump more than one year forward) just check whether the current month is less than the last encountered month. If, say, the data goes from Sep to Apr (9 to 4), you can safely assume that the year should be incremented.

    Code: (Demo)

    var_export(
        preg_replace_callback(
            '/^([a-z]{3}) +\d+/i',
            function($m) {
                static $lastMonthInt = 0;
                static $year = null;
                $year ??= date('Y', strtotime('-1 year'));
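                // Parse month and day together: a bare month name keeps today's day-of-month
                // and can overflow into the next month (e.g. "Feb" evaluated on the 31st).
                $currentMonthInt = (int) date('n', strtotime($m[0]));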
                if ($currentMonthInt < $lastMonthInt) {
                    ++$year;
                }
                $lastMonthInt = $currentMonthInt;
                return "$year " . date('m d', strtotime($m[0]));
            },
            $arr
        )
    );
    

    Final edit:

    To ensure that the highest generated year is the current year, use array_reverse() to process the data from the latest entry to the earliest. Compare each standardized timestamp expression against the previously processed one, which belongs to the chronologically later line. When the current stamp is greater than that one, a year boundary was crossed, so decrement the year. When finished processing, call array_reverse() on the result to return it to its original order.

    Code: (Demo)

    var_export(
        array_reverse(
            preg_replace_callback(
                '/^[a-z]{3} +\d+ \d\d:\d\d:\d\d/i',
                function($m) {
                    static $lastStamp = null;
                    static $year = null;
                    $year ??= date('Y');
                    $currentStamp = date('m d H:i:s', strtotime($m[0]));
                    if ($currentStamp > ($lastStamp ?? $currentStamp)) {
                        --$year;
                    }
                    $lastStamp = $currentStamp;
                    return "$year $currentStamp";
                },
                array_reverse($arr)
            )
        )
    );
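
For testing, the same newest-to-oldest idea can be wrapped in a function that takes the current year as a parameter and maps month names through a lookup table instead of strtotime(), so the output does not depend on the day the script runs. This is a sketch under those assumptions; the name `addYears` and its signature are hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: prepends a year to each "Mon D HH:MM:SS ..." line,
// with the newest line assigned to $currentYear.
function addYears(array $lines, int $currentYear): array
{
    $months = array_flip(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']);
    $lines = array_values($lines);
    $year = $currentYear;
    $lastStamp = null;
    $out = [];

    // Walk from the newest line to the oldest.
    for ($i = count($lines) - 1; $i >= 0; --$i) {
        if (!preg_match('/^([A-Z][a-z]{2}) +(\d+) (\d\d:\d\d:\d\d)/', $lines[$i], $m)
            || !isset($months[$m[1]])) {
            $out[$i] = $lines[$i]; // leave unrecognised lines untouched
            continue;
        }
        // Zero-padded "mm dd HH:MM:SS" sorts correctly as a plain string.
        $stamp = sprintf('%02d %02d %s', $months[$m[1]] + 1, $m[2], $m[3]);
        // An older line with a later stamp means a year boundary was crossed.
        if ($lastStamp !== null && strcmp($stamp, $lastStamp) > 0) {
            --$year;
        }
        $lastStamp = $stamp;
        $out[$i] = "$year $stamp" . substr($lines[$i], strlen($m[0]));
    }
    ksort($out); // restore the original order
    return $out;
}

$sample = [
    "Dec 23 21:37:56 hello",
    "Jan 12 02:08:23 hello",
    "Feb  4 12:21:09 hello",
    "Dec 25 21:47:51 hello",
];
print_r(addYears($sample, 2023));
```

With 2023 passed in, the first Dec line falls in 2022 and the remaining lines in 2023.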