Search code examples
powershelldatecomparetimespan

Find adjacent CSV rows whose timestamps exceed a given duration


I have a file with the following:

"BatchID","BatchName","JobName","ScannerID","DateCreated","DailyCounter"
"12713","9001","fst100","09                  ","2/6/2020 6:21:06 AM","1"
"12714","9002","fst100","09                  ","2/6/2020 6:21:42 AM","2"
"12715","9003","fst100","09                  ","2/6/2020 6:21:58 AM","3"
"12716","9004","fst100","09                  ","2/6/2020 6:22:13 AM","4"
"12717","9005","fst100","09                  ","2/6/2020 6:22:30 AM","5"
"12718","9006","fst100","09                  ","2/6/2020 6:24:54 AM","6"
"12719","9007","fst100","09                  ","2/6/2020 6:25:12 AM","7"
"12720","9008","fst100","09                  ","2/6/2020 6:55:11 AM","8"
"12721","9009","fst100","09                  ","2/6/2020 6:55:27 AM","9"

I want to compare date created fields and write to a file those lines that are 15 minutes or more apart.


Solution

  • Since:

    • your timestamp strings are in a format that is recognized as-is when you cast them to [datetime] (System.DateTime) (unless your format lists the day rather than the month first),

    • you can directly subtract two [datetime] instances to get a [timespan] instance (System.TimeSpan) whose .TotalMinutes property you can query

    the solution is fairly straightforward:

    $prevRow = $null
    Import-Csv in.csv | ForEach-Object { # Parse the CSV rows into objects and process each
      if ($prevRow -and ([datetime] $_.DateCreated - [datetime] $prevRow.DateCreated).TotalMinutes -gt 15) {
        # Output the previous row and the current row.
        $prevRow; $_
      }
      $prevRow = $_
    } # | Export-Csv -NoTypeInformation -Encoding Utf8 out.csv
    

    The above will print the matching rows to the screen; remove the # to export to a (different) CSV file.

    Note: As written, you can end up with duplicate rows in the output, namely if in adjacent rows A B C both A and B and B and C are more than 15 minutes apart, in which case A B B C is output.

    Documentation links: