
PowerShell 3: discern and display the last n lines of an ASCII file


I think this should be simple. I write the logging output of xcopy to a plain text file, with a daily delimiter (literally) "++++++++++++++++++++Tue 07/03/2018 0900 PM" appended to the log file prior to each daily backup. So the last lines in the file typically look like this:

daily delimiter

A new day appends a new delimiter line and so on.

I want to display the LAST delimiter and the lines which follow it to eof.

The approaches I've tried (Get-Content, Select-String -Context 0,20) don't work.

PowerShell says my search string ++++++++++++++++++++ isn't a valid regular expression, doesn't recognize the path, etc. Any help?

Memory and time are not at issue. Sorry if this is too simple.


Solution

  • mjsqu's helpful answer explains the need to escape + chars. as \+ in a regex in order for these chars. to be treated as literals.

    Thus, the regex to match a header line - 20 + chars. at the start of a line (^) - is: ^\+{20}
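
As a small illustration of the escaping point, the sketch below builds the header pattern with [regex]::Escape() instead of typing the backslashes by hand, and also shows Select-String's -SimpleMatch switch as a way to search for the delimiter literally. The sample line and the variable names are made up for the demo.

```powershell
# Sample header line, modeled on the delimiter described in the question.
$sampleLine = '++++++++++++++++++++Tue 07/03/2018 0900 PM'

# The literal delimiter text: 20 '+' characters.
$delimiter = '+' * 20

# [regex]::Escape() turns each '+' into '\+', so the result is safe inside a regex.
$pattern = '^' + [regex]::Escape($delimiter)

# -match treats its right-hand operand as a regex; the escaped pattern matches.
$matchesEscaped = $sampleLine -match $pattern

# The hand-written, quantified form from the text above is equivalent.
$matchesQuantified = $sampleLine -match '^\+{20}'

# -SimpleMatch makes Select-String treat the pattern as a literal string, not a regex.
$literalHit = $sampleLine | Select-String -Pattern $delimiter -SimpleMatch

$matchesEscaped, $matchesQuantified, [bool] $literalHit
```

Note that -SimpleMatch cannot anchor the match to the start of the line, so it finds the 20 + signs anywhere in a line.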

    That said, if it is sufficient to detect header lines by 20 + signs, Get-Content -Delimiter - which supports only literals as delimiters - offers a simple and efficient solution (PSv3+; assumes input file some.log in the current directory ./):

     $headerPrefix = '+' * 20  # -> '++++++++++++++++++++'
     $headerPrefix + (Get-Content ./some.log -Delimiter $headerPrefix -Tail 1)
    

    -Delimiter uses the specified header-line signature to break the file into "lines" (text between instances of the delimiter, which are blocks of lines here) and -Tail 1 returns the last "line" (block) by searching for it from the end of the file. Tip of the hat to mjsqu for helping me arrive at this solution.


    The following alternative solutions are regular-expression-based, which enables more sophisticated header-line matching.

    Note: While none of the solutions below require reading the log file into memory as a whole, they do read through the entire file, not just from the end.


    We can use the regex ^\+{20} in a switch -regex -file statement to process all lines of the log file and collect the last header line together with the lines that follow it; the code assumes input file path ./some.log:

    # Process all lines in the log file and 
    # collect each block's lines along the way in 
    # array $lastBlockLines, which means that after 
    # all lines have been processed, $lastBlockLines contains
    # the *last* block's lines.
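    $lastBlockLines = @() # start with an empty array, so that lines preceding the first header are collected as elements too
    switch -regex -file ./some.log {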
      '^\+{20}' { $lastBlockLines = @($_) } # start of new block, (re)initialize array
      default   { $lastBlockLines += $_ }   # add line to block
    }
    
    # Output the last block's lines.
    $lastBlockLines
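
For repeated use, the same switch-based idea could be wrapped in a function. What follows is only a sketch: the function name Get-LastLogBlock, its parameters, and the demo file are hypothetical, and a generic list is swapped in for the array so that appending does not copy the array on every line.

```powershell
function Get-LastLogBlock {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string] $HeaderPattern = '^\+{20}'   # assumed header signature: 20 leading '+' chars
    )
    # Resolve to a full file-system path before handing it to switch -File.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $block = New-Object 'System.Collections.Generic.List[string]'
    switch -Regex -File $fullPath {
        $HeaderPattern { $block.Clear(); $block.Add($_) }  # header: start a new block
        default        { $block.Add($_) }                  # other line: add to current block
    }
    # If the file contains no header at all, this outputs every line of the file.
    $block
}

# Demo with a throwaway file (hypothetical content).
$blockDemoPath = Join-Path ([System.IO.Path]::GetTempPath()) 'last-block-demo.log'
Set-Content -LiteralPath $blockDemoPath -Value @(
    '++++++++++++++++++++Mon 07/02/2018 0900 PM'
    'first.txt'
    '++++++++++++++++++++Tue 07/03/2018 0900 PM'
    'second.txt'
    'third.txt'
)
$lastBlock = @(Get-LastLogBlock -Path $blockDemoPath)
$lastBlock
Remove-Item -LiteralPath $blockDemoPath
```

With the demo file above, $lastBlock should hold the Tuesday header followed by second.txt and third.txt.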
    

    Alternatively, if you're willing to assume a fixed maximum number of lines in a block, a single-pipeline solution using Select-String is possible:

    Select-String '^\+{20}' ./some.log -Context 0,100 | Select-Object -Last 1 | 
      ForEach-Object { $_.Line; $_.Context.PostContext }
    
    • Select-String '^\+{20}' ./some.log -Context 0,100 matches all header lines in file ./some.log and, thanks to -Context 0,100, includes (up to) 100 lines that follow a matching line in the match object that is emitted (the 0 means that no lines that precede a matching line are to be included).

    • Select-Object -Last 1 passes only the last match on.

    • ForEach-Object { $_.Line; $_.Context.PostContext } then outputs the last match's matching line as well as the up to 100 lines that follow it.
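
To make the "fixed maximum" caveat concrete, here is a small sketch that runs the pipeline described above against a throwaway file, once with a generous context size and once with one that is too small. The helper name Get-LastBlockViaContext, its parameters, and the demo content are invented for illustration.

```powershell
# Throwaway demo file: the last block has a header plus 3 lines.
$contextDemoPath = Join-Path ([System.IO.Path]::GetTempPath()) 'context-demo.log'
Set-Content -LiteralPath $contextDemoPath -Value @(
    '++++++++++++++++++++Mon 07/02/2018 0900 PM'
    'a.txt'
    '++++++++++++++++++++Tue 07/03/2018 0900 PM'
    'b.txt'
    'c.txt'
    'd.txt'
)

function Get-LastBlockViaContext {
    param([string] $Path, [int] $MaxLines)
    # Same pipeline as above, with the post-context size made a parameter.
    Select-String -Pattern '^\+{20}' -LiteralPath $Path -Context 0, $MaxLines |
        Select-Object -Last 1 |
        ForEach-Object { $_.Line; $_.Context.PostContext }
}

$complete  = @(Get-LastBlockViaContext -Path $contextDemoPath -MaxLines 100)
$truncated = @(Get-LastBlockViaContext -Path $contextDemoPath -MaxLines 2)

"complete: $($complete.Count) lines; truncated: $($truncated.Count) lines"
Remove-Item -LiteralPath $contextDemoPath
```

With a context size of 2, the final line of the block (d.txt) is silently dropped, so the maximum has to be chosen comfortably above the largest block you expect.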


    If you don't mind reading the file twice, you can combine Select-String with Get-Content ... | Select-Object -Skip:

    Get-Content ./some.log | Select-Object -Skip (
        (Select-String '^\+{20}' ./some.log | Select-Object -Last 1).LineNumber - 1
      )
    

    This takes advantage of the fact that the match objects emitted by Select-String have a .LineNumber property reflecting the number of the line on which a given match was found. Passing the last match's line number minus 1 to Get-Content ... | Select-Object -Skip then outputs the matching line as well as all subsequent ones.
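
One thing the two-pass command does not cover is a file that contains no header line at all, in which case there is no match object to take a line number from. A possible way to guard against that is sketched below; the function name Get-LinesFromLastHeader, its parameters, and the demo files are hypothetical.

```powershell
function Get-LinesFromLastHeader {
    param(
        [string] $Path,
        [string] $HeaderPattern = '^\+{20}'   # assumed header signature
    )
    # Pass 1: find the last header line, if any.
    $lastMatch = Select-String -Pattern $HeaderPattern -LiteralPath $Path |
        Select-Object -Last 1
    if (-not $lastMatch) { return }   # no header found: output nothing

    # Pass 2: skip everything before the header (line numbers are 1-based).
    Get-Content -LiteralPath $Path | Select-Object -Skip ($lastMatch.LineNumber - 1)
}

# Demo with two throwaway files: one with headers, one without.
$tempDir      = [System.IO.Path]::GetTempPath()
$withHeaders  = Join-Path $tempDir 'skip-demo-headers.log'
$noHeaders    = Join-Path $tempDir 'skip-demo-plain.log'
Set-Content -LiteralPath $withHeaders -Value @(
    '++++++++++++++++++++Mon 07/02/2018 0900 PM'
    'a.txt'
    '++++++++++++++++++++Tue 07/03/2018 0900 PM'
    'b.txt'
    'c.txt'
)
Set-Content -LiteralPath $noHeaders -Value @('x.txt', 'y.txt')

$fromHeader = @(Get-LinesFromLastHeader -Path $withHeaders)
$fromPlain  = @(Get-LinesFromLastHeader -Path $noHeaders)

$fromHeader
Remove-Item -LiteralPath $withHeaders, $noHeaders
```

Returning nothing for a header-less file is a design choice made for this sketch; emitting the whole file or raising an error would be equally defensible.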