Background information:
If no January lines appear in the log file, then every line belongs to the current year. Example:
If January occurs one or more times in the month cycle, then the current year begins at the last occurrence of January. Example:
Code:
Determine each line's year from its three-letter month abbreviation and its position in the log, then add the correct year to each line.
$arr = [
    // Without the first Dec, Jan and Jan lines,
    // all subsequent lines are in the current year.
    "Dec 23 21:37:56 hello",
    "Jan 12 02:08:23 hello",
    "Jan 16 17:34:33 hello",
    "Feb 4 12:21:09 hello",
    "Mar 19 17:07:26 hello",
    "Apr 1 00:00:03 hello",
    "Apr 12 23:07:39 hello",
    "May 21 04:09:34 hello",
    "Jun 7 23:34:56 hello",
    "Jul 1 14:45:34 hello",
    "Aug 13 11:37:23 hello",
    "Sep 29 07:36:03 hello",
    "Oct 30 09:01:00 hello",
    "Nov 10 11:00:03 hello",
    "Dec 25 21:47:51 hello"
];
Create a function to find the years.
function setYear()
{
    global $arr, $y;
    $first = explode(' ', $arr[array_key_first($arr)]);
    // if the 1st line doesn't start with Jan, then it's the previous year.
    if (!in_array('01', $first)) {
        $y = date("Y", strtotime("-1 year"));
    } else {
        $y = date("Y");
    }
    return $y;
}
Convert the month name to an integer and prepend the year:
$arr = preg_replace_callback(
    '/^(\w+)\s+(\d+)\s/',
    function ($matches) {
        global $y;
        $yy = setYear($y);
        return date($yy . ' m d', strtotime($matches[0] . ' ' . date("Y"))) . ' ';
    },
    $arr
);
echo '<pre>';
print_r($arr);
echo '</pre>';
Unexpected result:
Array
(
[0] => 2022 12 23 21:37:56 hello
[1] => 2022 01 12 02:08:23 hello
[2] => 2022 01 16 17:34:33 hello
[3] => 2022 02 04 12:21:09 hello
[4] => 2022 03 19 17:07:26 hello
// ...
[13] => 2022 11 10 11:00:03 hello
[14] => 2022 12 25 21:47:51 hello
)
Expected result:
Array
(
[0] => 2021 12 23 21:37:56 hello
[1] => 2022 01 12 02:08:23 hello
[2] => 2022 01 16 17:34:33 hello
[3] => 2022 02 04 12:21:09 hello
[4] => 2022 03 19 17:07:26 hello
// ...
[13] => 2022 11 10 11:00:03 hello
[14] => 2022 12 25 21:47:51 hello
)
Use a static variable instead of bringing global variables into scope with global. The static keyword ensures that the value assigned in previous iterations is retained and remains accessible. Once Jan is encountered, set the flag to true. Until the flag is true, subtract 1 year from the date's year.
Code: (Demo)
var_export(
    preg_replace_callback(
        '/^([a-z]{3}) +\d+/i',
        function($m) {
            // static keeps the flag alive between callback invocations
            static $encounteredJan = false;
            $encounteredJan = $encounteredJan || $m[1] === 'Jan';
            // lines before the first Jan belong to the previous year
            return date('Y m d', strtotime($m[0] . ($encounteredJan ? '' : ' -1 year')));
        },
        $arr
    )
);
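To illustrate why the static flag works, here is a minimal sketch of how a static variable inside a closure keeps its value between calls to that same closure. The names $counter, $calls and $results are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical illustration: a static variable is initialized once
// and then retained across calls to the same closure.
$counter = function () {
    static $calls = 0; // runs only on the first call
    return ++$calls;
};

$results = [$counter(), $counter(), $counter()];
print_r($results); // the three calls return 1, 2 and 3
```

The callback passed to preg_replace_callback() behaves the same way: it is one closure invoked once per array element, so $encounteredJan stays true for every line after the first Jan.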
If you cannot rely on Jan existing in the dataset, then (assuming you never need to jump more than one year forward) just check whether the current month is less than the last encountered month. If, say, the data goes from Sep to Apr (9 to 4), then you can safely assume that the year should be incremented.
Code: (Demo)
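var_export(
    preg_replace_callback(
        '/^([a-z]{3}) +\d+/i',
        function($m) {
            static $lastMonthInt = 0;
            static $year = null;
            // start one year back; each month rollover moves forward a year
            $year ??= date('Y', strtotime('-1 year'));
            // anchor to day 1 so the month lookup cannot overflow into the
            // next month when run on the 29th-31st
            $currentMonthInt = (int) date('n', strtotime("1 $m[1]"));
            if ($currentMonthInt < $lastMonthInt) {
                ++$year;
            }
            $lastMonthInt = $currentMonthInt;
            return "$year " . date('m d', strtotime($m[0]));
        },
        $arr
    )
);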
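The same rollover rule can also be sketched as a plain function whose starting year is passed in, which makes it independent of the clock and easier to test. The function name prependYearByRollover(), its $startYear parameter and the month map are assumptions for illustration, not part of the code above.

```php
<?php
// Hypothetical helper: increments the year whenever the month number drops.
// $startYear is an assumed parameter (the year of the first line).
function prependYearByRollover(array $lines, int $startYear): array
{
    $months = ['Jan' => 1, 'Feb' => 2, 'Mar' => 3, 'Apr' => 4, 'May' => 5, 'Jun' => 6,
               'Jul' => 7, 'Aug' => 8, 'Sep' => 9, 'Oct' => 10, 'Nov' => 11, 'Dec' => 12];
    $year = $startYear;
    $lastMonth = 0;
    $out = [];
    foreach ($lines as $line) {
        // Lines that do not start with "Mon D " are passed through unchanged.
        if (preg_match('/^([A-Z][a-z]{2}) +(\d+) (.*)$/', $line, $m) && isset($months[$m[1]])) {
            $month = $months[$m[1]];
            if ($month < $lastMonth) {
                ++$year; // e.g. Sep (9) followed by Apr (4)
            }
            $lastMonth = $month;
            $line = sprintf('%d %02d %02d %s', $year, $month, $m[2], $m[3]);
        }
        $out[] = $line;
    }
    return $out;
}

$sample = ['Sep 29 07:36:03 hello', 'Apr 1 00:00:03 hello', 'Apr 12 23:07:39 hello'];
print_r(prependYearByRollover($sample, 2021));
```

With a start year of 2021, the Sep line stays in 2021 and both Apr lines are assigned 2022.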
Final edit:
To ensure that the highest generated year is the current year, use array_reverse() to process the data from the latest entry to the earliest. Compare each standardized timestamp expression against the previously processed one. Because the data is traversed backwards, a current stamp that is greater than the last one means a year boundary was crossed, so decrement the year. When finished processing, call array_reverse() on the result to return it to its original order.
Code: (Demo)
var_export(
    array_reverse(
        preg_replace_callback(
            '/^[a-z]{3} +\d+ \d\d:\d\d:\d\d/i',
            function($m) {
                static $lastStamp = null;
                static $year = null;
                // the newest line is processed first and gets the current year
                $year ??= date('Y');
                $currentStamp = date('m d H:i:s', strtotime($m[0]));
                // walking backwards, a larger stamp means we crossed into the prior year
                if ($currentStamp > ($lastStamp ?? $currentStamp)) {
                    --$year;
                }
                $lastStamp = $currentStamp;
                return "$year $currentStamp";
            },
            array_reverse($arr)
        )
    )
);
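As a rough sketch of the same newest-to-oldest idea without static variables, the function below takes the current year as a parameter so the result does not depend on when it runs. The name prependYearFromNewest(), the $currentYear parameter and the month map are hypothetical and not taken from the code above.

```php
<?php
// Hypothetical helper: walks the lines from newest to oldest and decrements
// the year whenever the "mm dd hh:mm:ss" key grows instead of shrinking.
function prependYearFromNewest(array $lines, int $currentYear): array
{
    $months = ['Jan' => 1, 'Feb' => 2, 'Mar' => 3, 'Apr' => 4, 'May' => 5, 'Jun' => 6,
               'Jul' => 7, 'Aug' => 8, 'Sep' => 9, 'Oct' => 10, 'Nov' => 11, 'Dec' => 12];
    $year = $currentYear;
    $lastKey = null;
    $out = [];
    foreach (array_reverse($lines) as $line) {
        if (preg_match('/^([A-Z][a-z]{2}) +(\d+) (\d\d:\d\d:\d\d)(.*)$/', $line, $m)
            && isset($months[$m[1]])) {
            // Zero-padded key, so plain string comparison orders it correctly.
            $key = sprintf('%02d %02d %s', $months[$m[1]], $m[2], $m[3]);
            if ($lastKey !== null && strcmp($key, $lastKey) > 0) {
                --$year; // an older line with a later month/day: previous year
            }
            $lastKey = $key;
            $line = $year . ' ' . $key . $m[4];
        }
        $out[] = $line;
    }
    return array_reverse($out); // restore the original order
}

$sample = ['Dec 23 21:37:56 hello', 'Jan 12 02:08:23 hello', 'Dec 25 21:47:51 hello'];
print_r(prependYearFromNewest($sample, 2022));
```

With 2022 as the current year, the last two lines are assigned 2022 and the leading Dec 23 line is assigned 2021.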