CMD pipe different from PowerShell pipe?


I am trying to pipe Node.js output to pino-pretty:

node .\dist\GameNode.js | pino-pretty

Running this in CMD gives me the formatted output, but running it in PowerShell gives me nothing. I read that PowerShell uses objects when piping, so I tried

node .\dist\GameNode.js | Out-String -Stream | pino-pretty

But this also does not work.

Why does it work inside CMD but not inside PowerShell? Thanks :)


Solution

  • Note: The specific pino-pretty problem described in the question is not resolved by the information below. Lukas (the OP) has filed a bug report here.

    It's surprising that you get nothing, but the fundamental difference is:

    • cmd.exe's pipeline conducts raw data, i.e. byte streams (which a given program receiving the data may or may not itself interpret as text).

    • PowerShell's pipeline, when talking to external programs, conducts only text (strings), which has two implications:

      • On piping data to an external program, text must be encoded, which happens based on the character encoding stored in preference variable $OutputEncoding.

      • On receiving data from an external program, data must be decoded, which happens based on the character encoding stored in [Console]::OutputEncoding, which by default is the system's OEM code page, as reflected in chcp.

        • In Windows PowerShell and PowerShell (Core) 7+ up to v7.3.x, this decoding happens invariably, irrespective of whether the data is then further processed in PowerShell or passed on to another external program.

          • In PowerShell (Core) v7.4+, however, raw byte data now is passed between external programs and to > or >>. Additionally, you can now use an array of bytes to send raw bytes to an external program - see this answer.
        • The only exception is if external-program output is neither captured, sent on through the pipeline, nor redirected to a file: in that case, the data prints straight to the console (terminal), but only in a local console (when using PowerShell remoting to interact with a remote machine, decoding is again invariably involved).

          • This direct-to-display printing can sometimes hide encoding problems, because some programs, notably python, use full Unicode support situationally in that case; that is, the output may print fine, but when you try to process it further, encoding problems can surface.
          • A simple way to force decoding is to enclose the call in (...); e.g.,
            python -c "print('eé')" prints fine, but
            (python -c "print('eé')") surfaces an encoding problem; see the bottom section for more information.
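
The decode-then-re-encode round trip described above can be modeled in a few lines of Python. This is only a rough sketch: it does not call PowerShell, the function name relay is hypothetical, and code page 437 is assumed as the OEM code page (it varies by system locale).

```python
def relay(text, producer_enc, console_enc, output_enc):
    """Model PowerShell (up to v7.3.x) relaying text between two external programs."""
    # Bytes the first program writes to stdout.
    raw = text.encode(producer_enc)
    # PowerShell decodes them based on [Console]::OutputEncoding.
    decoded = raw.decode(console_enc, errors="replace")
    # PowerShell re-encodes the string based on $OutputEncoding for the next program.
    return decoded.encode(output_enc, errors="replace")


original = "é".encode("utf-8")

# UTF-8 producer, OEM console code page (assumed 437): the bytes get mangled.
mismatched = relay("é", "utf-8", "cp437", "utf-8")

# Everything set to UTF-8: the bytes arrive unchanged.
matched = relay("é", "utf-8", "utf-8", "utf-8")

print(mismatched == original)  # False
print(matched == original)     # True
```

The point of the model is that the receiving program only gets the original bytes back when the producer's encoding and both PowerShell-side encodings agree.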

    While console applications traditionally use the active OEM code page for character encoding and decoding, Node.js always uses UTF-8.

    Therefore, in order for PowerShell to communicate properly with Node.js programs, you must (temporarily) set the following first:

    $OutputEncoding = [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()
    

    If you want to fundamentally switch to UTF-8, either system-wide (which has far-reaching consequences) or only for PowerShell console windows, see this answer.


    As an aside: an intermediate Out-String -Stream pipeline segment is never needed for relaying an external program's output - it is effectively a (costly) no-op, because streaming stdout output line by line is what PowerShell does by default. In other words: it is not surprising that it made no difference in your case.


    Optional reading: Convenience function Invoke-WithEncoding and diagnostic function Debug-NativeInOutput for ad-hoc encoding needs / diagnosis:

    If switching all PowerShell consoles to UTF-8 isn't an option and/or you need to deal with "rogue" programs that use a specific encoding other than UTF-8 or the active OEM code page, you can install:

    • Function Invoke-WithEncoding, which temporarily switches to a given encoding when invoking an external program, directly from this Gist as follows (I can assure you that doing so is safe, but you should always check):
    # Download and define advanced function Invoke-WithEncoding in the current session.
    irm https://gist.github.com/mklement0/ef57aea441ea8bd43387a7d7edfc6c19/raw/Invoke-WithEncoding.ps1 | iex
    
    • Function Debug-NativeInOutput, which helps diagnose encoding problems with external programs, directly from this Gist as follows (again, you should check first):
    # Download and define advanced function Debug-NativeInOutput in the current session.
    irm https://gist.github.com/mklement0/eac1f18fbe0fc2798b214229b747e5dd/raw/Debug-NativeInOutput.ps1 | iex
    

    Below are example commands that use a python command to print an accented character.

    Like Node.js, Python's behavior is nonstandard, although it doesn't use UTF-8, but the system's active ANSI(!) code page (rather than the expected OEM code page).

    That is, even if you switch your PowerShell consoles to UTF-8, communication with Python scripts won't work properly by default unless extra effort is made, which Invoke-WithEncoding can encapsulate for you:

    Note: I'm using Python as an example here, to illustrate how the functions work. It is possible to make Python use UTF-8, namely by either setting environment variable PYTHONUTF8 to 1 or - in v3.7+ - by passing command-line option -X utf8 (which is case-sensitive).
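
A minimal sketch for checking that either switch really enables Python's UTF-8 mode: it launches a child interpreter with the standard subprocess module and reads back sys.flags.utf8_mode. The helper name utf8_mode_of_child is hypothetical, and Python 3.7 or later is assumed.

```python
import os
import subprocess
import sys


def utf8_mode_of_child(extra_args=(), extra_env=None):
    """Start a child Python and return its sys.flags.utf8_mode value as a string."""
    env = dict(os.environ)
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", "import sys; print(sys.flags.utf8_mode)"],
        capture_output=True, text=True, check=True, env=env,
    )
    return result.stdout.strip()


# Enable UTF-8 mode via the command-line option.
print(utf8_mode_of_child(extra_args=["-X", "utf8"]))

# Enable UTF-8 mode via the environment variable.
print(utf8_mode_of_child(extra_env={"PYTHONUTF8": "1"}))
```

Both calls should report 1, meaning the child interpreter runs in UTF-8 mode.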


    Invoke-WithEncoding example:

    # Outputs *already-decoded* output, so if the output *prints* fine, 
    # then *decoding* worked fine too.
    PS> Invoke-WithEncoding { python -c "print('eé')" } -Encoding Ansi -WindowsOnly
    eé
    
    • Note that Invoke-WithEncoding ensures that actual decoding to a .NET string happens before it outputs, so that encoding problems aren't accidentally masked by the direct-to-display output seemingly being correct on Windows (see below for more).

    • -WindowsOnly is for cross-platform compatibility and ensures that the encoding is only applied on Windows in this case (on Unix, Python uses UTF-8).


    Debug-NativeInOutput example:

    With the PowerShell console at its default, using the system's OEM code page, you'll see the following output with the same Python command, calling from PowerShell (Core) 7.1:

    PS> Debug-NativeInOutput { python -c "print('eé')" }
    

    [Screenshot: Debug-NativeInOutput output]

    • Note the DecodedOutput property, showing the mis-decoded result based on interpreting Python's output as OEM- rather than as ANSI-encoded: 'eΘ'. (The Input* properties are blank, because the command did not involve piping data to the Python script.)

    • By contrast, with direct-to-display printing the output prints fine (because Python then - and only then - uses Unicode), which hides the problem, but as soon as you want to programmatically process the output - capture it in a variable, send it to another command in the pipeline, redirect it to a file - the encoding problem will surface.

    • Like Invoke-WithEncoding, Debug-NativeInOutput supports an -Encoding parameter, so if you pass -Encoding Ansi to the call above, you'll see that Python's output is decoded properly.

    • The output reflects the fact that, in PowerShell (Core), $OutputEncoding defaults to UTF-8, whereas in Windows PowerShell it defaults to ASCII(!). This mismatch with the actual encoding in effect in the console window is problematic, and this comment on GitHub issue #14945 proposes a way to resolve this (for PowerShell (Core) only) in the future.
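
The mis-decoded 'eΘ' result mentioned above can be reproduced without PowerShell. The sketch below assumes that the ANSI code page is Windows-1252 and the OEM code page is 437, as on a typical US-English system; other locales use different code pages and would produce different garbled characters.

```python
text = "eé"

# What Python writes when its stdout is piped (assumed ANSI code page: Windows-1252).
ansi_bytes = text.encode("cp1252")

# What PowerShell sees when it decodes those bytes as OEM (assumed code page: 437).
misdecoded = ansi_bytes.decode("cp437")

# What decoding with the matching ANSI code page yields instead.
correct = ansi_bytes.decode("cp1252")

print(misdecoded == "eΘ")  # True
print(correct == "eé")     # True
```

The single byte 0xE9 means é in Windows-1252 but Θ in code page 437, which is why decoding with the wrong code page silently swaps the character rather than raising an error.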