I'm trying to process a log file generated by my framework. It looks like this:
log-2023-05-23.log:
INFO - 2023-05-23 09:10:45 --> CSRF token verified.
CRITICAL - 2023-05-23 09:24:54 --> Undefined array key 158
in APPPATH/Views/tables/components/record_year__months_regions.php on line 18.
1 APPPATH/Views/tables/components/record_year__months_regions.php(18): CodeIgniter\Debug\Exceptions->errorHandler(2, 'Undefined array key 158', 'FILEPATH', 18)
...
17 FCPATH/index.php(67): CodeIgniter\CodeIgniter->run()
CRITICAL - 2023-05-23 09:26:33 --> Undefined array key 158
in APPPATH/Views/tables/components/record_year__months_regions.php on line 18.
...
Many more entries starting with CRITICAL, INFO, etc. follow after this.
I'm trying to read this file line by line and, depending on the first word in each line (INFO, CRITICAL, etc.), either start a new array entry or append the line to the previous one.
I currently have this function, which does exactly half of the job: it adds the first, third, fifth, and so on entries to the array, but every other entry is dropped.
function.php:
while (! $file->eof()) {
    $line = $file->fgets();
    $words = explode(' ', $line);
    $firstWord = $words[0] ?? '';
    $timeWord = $words[3] ?? '';
    if ($isCollecting) {
        if (in_array($firstWord, $this->keyWords, true)) {
            // Found the start of a new entry, so finalize the current entry
            $output[$date][$keyWord][$time] = $entryLines;
            $isCollecting = false;
            $keyWord = null;
            $time = null;
            $entryLines = [];
        } else {
            // Continue collecting lines for the current entry
            $entryLines[] = $line;
        }
    } elseif (in_array($firstWord, $this->keyWords, true)) {
        // Found the start of a new entry
        $isCollecting = true;
        $keyWord = $firstWord;
        $time = $timeWord;
        $entryLines[] = $line;
    }
}
// Finalize the last entry if still collecting lines
if ($isCollecting) {
    $output[$date][$keyWord][$time] = $entryLines;
}
My resulting array looks like this:
array(1) {
["2023-05-23"]=>
array(2) {
["INFO"]=>
array(1) {
["09:10:45"]=>
array(1) {
[0]=>
string(52) "INFO - 2023-05-23 09:10:45 --> CSRF token verified."
}
}
["CRITICAL"]=>
array(4) {
["09:26:33"]=> ...
I can't get my head around how to solve this problem.
I personally find it much easier to look at and reason about objects instead of arrays, particularly when they get deeply nested. There's minor overhead, but I think as long as you keep the objects simple that's worth it.
class LogData
{
    private const LOG_TYPE_MISSING_PARENT = 'MISSING_PARENT';

    public array $lines = [];

    public function __construct(public readonly string $type, public readonly ?string $dateTime = null)
    {
    }

    /**
     * This object is used just in case the log file does not start with a log type.
     */
    public static function createMissingParentHolder(): self
    {
        return new self(self::LOG_TYPE_MISSING_PARENT);
    }
}
One weird thing is the MISSING_PARENT type. It is probably overkill, but it exists to defend against a theoretical edge case where the first line is missing a keyword, so that we at least have somewhere to stash that line. It can obviously be dropped, but I'd rather be safe than sorry, especially since this log reader is reading exception logs and you don't want it to throw an exception of its own.
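Then we can create a variable such as $currentLogObject that is an instance of that class, stash info and lines in it, store it in a global array as needed, and re-instantiate when we detect a new grouping.
One advantage of this is that we are always collecting, so there is no flag to keep track of. In the original loop, the keyword line that closes one entry is never used to open the next one, and the continuation lines after it are skipped until another keyword appears, which is why every other entry is lost.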
// $fp is assumed to be a file handle that was already opened with fopen()
$keywords = ['INFO', 'CRITICAL', 'ERROR', 'WARNING', 'DEBUG'];
$logObjects = [];
$currentLogObject = null;
while (($line = fgets($fp)) !== false) {
    $words = explode(' ', $line);
    $firstWord = $words[0] ?? '';
    $timeWord = $words[3] ?? '';
    // See if we found a special word
    if (in_array($firstWord, $keywords, true)) {
        // If we have an existing object, store it
        if ($currentLogObject !== null) {
            $logObjects[] = $currentLogObject;
        }
        // Create a new object for the entry that starts on this line
        $currentLogObject = new LogData($firstWord, $timeWord);
    } elseif ($currentLogObject === null) {
        // We didn't find a special word, so make sure something exists to stash the line
        $currentLogObject = LogData::createMissingParentHolder();
    }
    // No matter what happens above, stash the line (it still ends with its newline)
    $currentLogObject->lines[] = $line;
}
// At the end of the loop, make sure to store the last object
if ($currentLogObject !== null) {
    $logObjects[] = $currentLogObject;
}