Search code examples
powershellpipelinepowershell-core

Stream powershell output objects to format-table


I'm writing a powershell function to process lines from a log file. The log file is being written to by another process. In order to make viewing the logs easier I'm parsing each log line into an object and passing that along the pipeline. I've added a -Wait parameter which will continue watching for changes instead of exiting.

Note: This is a trimmed down version with hard-coded fields but it shows the same behavior

function Watch-Log($Path, [switch]$Wait) {

    [System.IO.FileStream]$log = $null
    [System.IO.StreamReader]$logReader = $null

    $logPath = Split-Path -Parent $Path
    $logFilter = Split-Path -Leaf $Path
    $watcher = [System.IO.FileSystemWatcher]::new($logPath, $logFilter)
    $watcher.IncludeSubdirectories = $false
    $createdEventIdentifier = "Service_LogFileCreated"
    Register-ObjectEvent $watcher Created -SourceIdentifier $createdEventIdentifier

    $watcher.EnableRaisingEvents = $true

    function Open-ServiceLogStream([ref][System.IO.FileStream]$log, [ref][System.IO.StreamReader]$logReader) {
        if ($log.Value) { $log.Value.Close() }

        $log.Value = [System.IO.File]::Open($Path, "Open", "Read", "ReadWrite,Delete");
        $logReader.Value = [System.IO.StreamReader]::new($log.Value)
    }

    try {
        Open-ServiceLogStream ([ref]$log) ([ref]$logReader)

        while ($true) {
            $line = $logReader.ReadLine()
            if (!$line) {

                if (!$Wait) { return }

                $newLogEvent = Get-Event -SourceIdentifier $createdEventIdentifier -ErrorAction SilentlyContinue

                if ($newLogEvent) {
                    Remove-Event -EventIdentifier $newLogEvent.EventIdentifier
                    Open-ServiceLogStream ([ref]$log) ([ref]$logReader)
                }
                else {
                    Start-Sleep -Milliseconds 100
                }
                continue;
            }

            $lineParts = $line -split " "
            [PSCustomObject]@{ PSTypeName = "Test.LogLine"; Date = $lineParts[0]; Priority = $lineParts[1]; LogMessage = $lineParts[2]; }
        }
    }
    finally {
        Unregister-Event -SourceIdentifier $createdEventIdentifier
    }
}

So it might be used like this:

> Watch-Log test.log | Select Date, Priority, LogMessage

Date               Priority LogMessage
----               -------- ----------
7/10/2023 19:18:50 Message  New Log Opened
7/10/2023 19:19:33 Message  Line Here!
7/10/2023 19:19:33 Message  Line Here!
...

Unfortuantely, if I'm using the -Wait parameter the "output" of the pipeline is buffered forever and never gets printed. Ideally, I would like it to start printing the output in the table, and add new rows as logs are emitted. I'm okay if the column widths stay fixed based on the first log line it sees.

I can trick it into printing stuff immediately by using Out-Host

Watch-Log test.log -Wait | Out-Host

But it's frustrating to require other users of this to know they need to append Out-Host, and if it's wrapped in another function than you can't force that.

Update

I've finally uncovered the source of the problem after being unable to repro the issue on a machine without my default Powershell profile.

A few years ago I was doing something in Powershell and ended up overriding Out-Default in my powershell profile.

I've been dragging this profile around for quite some time (since the Windows Powershell only days I think?) and totally forgot about this.

function Out-Default {
  $input | Tee-Object -var global:last | Microsoft.PowerShell.Core\out-default
}

Since this isn't a pipeline function, it blocks as it gathers everything into $input before passing it onto Tee-Object which is causing the default formatting to break.

Yes, I've been trying to figure this out for days and FINALLY only moments after submitting a bug about it I uncovered the problem! Thanks everyone for helping get me pointed in the right direction.


Solution

  • The likely culprit is the infamous 300-millisecond delay that Format-Table uses in order to collect enough input data so as to make a reasonable guess how wide the output columns should be, before it starts to produce display output.

    • See this answer for a detailed explanation.

    • The most common gripe about this behavior is that - when Format-Table is used implicitly - it can interfere with output ordering across PowerShell's different output streams - see this answer.


    However, as the following example shows, the 300-millisecond delay can result in potentially much longer delays, and even potentially in never producing display output:[1]

    # Note: Omitting Format-Table exhibits the same behavior, because
    #       an object with 4 or fewer properties is *implicitly* table-formatted,
    #       unless the default formatting is governed by predefined formatting
    #       data associated with the object's type.
    & { [pscustomobject] @{ foo = 1 }; Start-Sleep 10 } | Format-Table
    

    This doesn't print the table until 10 seconds later, namely after the Start-Sleep call returns and the input command has terminated.

    The problem also occurs with use of cmdlets that wait for an unspecified period for an event to occur, namely Wait-Job, Wait-Process and Wait-Event.


    The problem is that Format-Table doesn't regain control until it either receives the next input object or the input command terminates.


    Workarounds:

    The 300-millisecond delay can be avoided if Format-Table knows the column widths ahead of time, which can be achieved in one of two ways:

    • Specify the column widths via calculated properties, as zett42 recommends:

      # Starts to produce display output *instantly*, 
      # due to the predefined column width.
      & { [pscustomobject] @{ foo = 1 }; Start-Sleep 10 } | 
        Format-Table -Property @{ Expression='foo'; Width = 10 }
      
      • Unfortunately, you'll have to do that for every output property (column).

      • Also, any explicit use of a Format-* cmdlet invariably interferes with using the output as data - see this answer.

    • Associate predefined formatting data with the .NET type of your output objects and define a table view with fixed column widths.

      • This is the preferable solution - not least because it doesn't interfere with outputting data - but requires the nontrivial effort of authoring a formatting file (*._Format.ps1xml), which must be loaded into the session first.

      • Additionally, you must make Write-Log either use a specific .NET type associated with your formatting data or add a PSTypeName property to [pscustomobject] output objects (e.g, [pscustomobject] @{ PSTypeName = 'My.Type'; foo = 1 })


    [1] Strictly speaking, an initial, empty line always prints, to be followed by the actual table only after the 300-millisecond delay has elapsed.