
How can I extract the latest row for each entry from a log file, based on the most recent date, using PowerShell?


I'm a relatively new PowerShell user, and have what I thought was a simple question. I have spent a bit of time looking for similar scenarios and, surprisingly, haven't found any. I would post my failed attempts, but I can't even get close!

I have a log file with repetitive data, and I want to extract the latest event for each "unique" entry. The problem is that every line is distinct because of its individual timestamp. The "unique" criterion is the value in column 1. Example:

AE0440,1,2,3,30/08/2012,12:00:01,XXX
AE0441,1,2,4,30/08/2012,12:02:01,XXX
AE0442,1,2,4,30/08/2012,12:03:01,XXX
AE0440,1,2,4,30/08/2012,12:04:01,YYY
AE0441,1,2,4,30/08/2012,12:06:01,XXX
AE0442,1,2,4,30/08/2012,12:08:01,XXX
AE0441,1,2,5,30/08/2012,12:10:01,ZZZ

Therefore the output I want would be (order not relevant):

AE0440,1,2,4,30/08/2012,12:04:01,YYY
AE0442,1,2,4,30/08/2012,12:08:01,XXX
AE0441,1,2,5,30/08/2012,12:10:01,ZZZ

How can I get this data and discard the older entries?


Solution

  • Try this; it may look a bit cryptic to a first-time user. It reads the file, groups the lines by the value in the first column (three groups for the sample data), sorts each group in descending order by the date and time parsed from the third- and second-to-last fields, and returns the first (most recent) line of each group.

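    # Group lines by column 1, then keep the most recent line of each group.
    # InvariantCulture keeps '/' in the format literal, whatever the current culture's date separator is.
    Get-Content .\log.txt | Group-Object { $_.Split(',')[0] } | ForEach-Object {
        $_.Group | Sort-Object -Descending { [DateTime]::ParseExact(($_.Split(',')[-3,-2] -join ' '), 'dd/MM/yyyy HH:mm:ss', [System.Globalization.CultureInfo]::InvariantCulture) } | Select-Object -First 1
    }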
    
    AE0440,1,2,4,30/08/2012,12:04:01,YYY
    AE0441,1,2,5,30/08/2012,12:10:01,ZZZ
    AE0442,1,2,4,30/08/2012,12:08:01,XXX
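
For readers who find the one-liner hard to follow, the same idea can be sketched as a single pass that remembers the newest line seen so far for each key. This is only an illustration under some assumptions: `Get-LatestPerKey` is a hypothetical helper name (not a built-in cmdlet), the sample lines from the question are inlined in place of `log.txt`, and the date and time are assumed to always be the third- and second-to-last fields.

```powershell
# Sample lines copied from the question, inlined so the sketch runs without a log file.
$sampleLines = @(
    'AE0440,1,2,3,30/08/2012,12:00:01,XXX'
    'AE0441,1,2,4,30/08/2012,12:02:01,XXX'
    'AE0442,1,2,4,30/08/2012,12:03:01,XXX'
    'AE0440,1,2,4,30/08/2012,12:04:01,YYY'
    'AE0441,1,2,4,30/08/2012,12:06:01,XXX'
    'AE0442,1,2,4,30/08/2012,12:08:01,XXX'
    'AE0441,1,2,5,30/08/2012,12:10:01,ZZZ'
)

# Hypothetical helper: returns the most recent line for each distinct value in column 1.
function Get-LatestPerKey {
    param([string[]]$LogLines)

    $culture = [System.Globalization.CultureInfo]::InvariantCulture
    $latest  = @{}   # key (column 1) -> @{ Stamp = [DateTime]; Line = [string] }

    foreach ($entry in $LogLines) {
        $fields = $entry.Split(',')
        $key    = $fields[0]

        # Assumption: date and time are the third- and second-to-last fields.
        $stamp = [DateTime]::ParseExact("$($fields[-3]) $($fields[-2])", 'dd/MM/yyyy HH:mm:ss', $culture)

        # Keep this line if the key is new or the timestamp is later than the stored one.
        if ((-not $latest.ContainsKey($key)) -or ($stamp -gt $latest[$key].Stamp)) {
            $latest[$key] = @{ Stamp = $stamp; Line = $entry }
        }
    }

    # Emit only the original lines; hashtable order is not guaranteed.
    $latest.Values | ForEach-Object { $_.Line }
}

$result = Get-LatestPerKey -LogLines $sampleLines
$result
```

With a real file, the call would presumably become `Get-LatestPerKey -LogLines (Get-Content .\log.txt)`. Compared with grouping and sorting, this variant avoids sorting every group, which may matter for large logs, but the output order is unspecified (which the question says is acceptable).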