
Text output from other programs in PowerShell contains unwanted spaces between characters


I would like to work with text output from other programs in PowerShell, but I keep getting a strange format with unwanted whitespace between every character.

For example, when running:

&"C:\Program Files\Common Files\McAfee\SystemCore\aacinfo.exe" report

It produces output like:

Subscribing for reports, press any key to stop...


Reporting on point-product 12dd49a3-c20f-4636-b638-fc9ce8c76735
Waiting for events for point-product ProductTracker...

Reporting on point-product 4562a25a-895d-4e20-bb47-7317801f0108
Waiting for events for point-product ProductTracker...

and so on... it is a continuous stream until the program is stopped.

When I try to save it to a file with:

& "C:\Program Files\Common Files\McAfee\SystemCore\aacinfo.exe" report | out-file -encoding utf8 C:\temp\output.log

The file content looks like this:

S u b s c r i b i n g   f o r   r e p o r t s ,   p r e s s   a n y   k e y   t o   s t o p . . . 
 
 
 
 
 
 R e p o r t i n g   o n   p o i n t - p r o d u c t   1 2 d d 4 9 a 3 - c 2 0 f - 4 6 3 6 - b 6 3 8 - f c 9 c e 8 c 7 6 7 3 5 
 
 W a i t i n g   f o r   e v e n t s   f o r   p o i n t - p r o d u c t   P r o d u c t T r a c k e r . . . 
 
 
 
 R e p o r t i n g   o n   p o i n t - p r o d u c t   4 5 6 2 a 2 5 a - 8 9 5 d - 4 e 2 0 - b b 4 7 - 7 3 1 7 8 0 1 f 0 1 0 8 
 
 W a i t i n g   f o r   e v e n t s   f o r   p o i n t - p r o d u c t   P r o d u c t T r a c k e r . . . 

Why is that, and how can I avoid all those additional spaces between the characters?


Solution

  • It looks like aacinfo.exe unexpectedly outputs UTF-16LE-encoded text instead of using the character encoding specified by the current console's code page (as reflected in the output from chcp).

    Because PowerShell expects the latter, it misinterprets aacinfo.exe's output, which explains your symptom.

    • In short: What appear to be extra spaces (e.g. in Notepad) are actually NUL characters (characters with code point 0x0). UTF-16LE encodes each ASCII-range character as a two-byte sequence whose second byte is 0x0; when those bytes are mistakenly interpreted as individual characters, each 0x0 byte becomes a character in its own right.
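
    The misinterpretation described above can be reproduced without aacinfo.exe. The following is a minimal sketch, assuming ISO-8859-1 (code page 28591) as a stand-in for whatever single-byte code page the console actually uses; the sample string is taken from the output in the question.

```powershell
# Encode a sample string as UTF-16LE (no BOM), the encoding aacinfo.exe appears to emit.
$bytes = [System.Text.Encoding]::Unicode.GetBytes('Subscribing')

# Misdecode the same bytes with a single-byte encoding.
# Assumption: ISO-8859-1 (code page 28591) stands in for the console code page.
$misread = [System.Text.Encoding]::GetEncoding(28591).GetString($bytes)

$bytes.Length                 # 22: two bytes per character
$misread.Length               # 22: each byte became its own character
[int] $misread[1]             # 0: every second character is NUL
$misread.Replace("`0", ' ')   # NULs shown as spaces: 'S u b s c r i b i n g '
```

    Editors such as Notepad typically render the NUL characters as blanks, which is why the file appears to contain a space after every character.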

    The solution is to temporarily set [Console]::OutputEncoding to UTF-16LE ([System.Text.Encoding]::Unicode), which instructs PowerShell to use that encoding when decoding aacinfo.exe's output:

    $prev = [Console]::OutputEncoding
    [Console]::OutputEncoding = [System.Text.Encoding]::Unicode
    
    & 'C:\Program Files\Common Files\McAfee\SystemCore\aacinfo.exe' report | 
      Out-File -Encoding utf8 C:\temp\output.log
    
    [Console]::OutputEncoding = $prev
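
    Because aacinfo.exe runs until it is stopped, the restore step above may never be reached if the pipeline is interrupted. One possible way to guard against that is to wrap the save/restore pattern in try/finally. The helper below is a sketch only; the function name Invoke-WithOutputEncoding and its parameters are hypothetical and not part of PowerShell.

```powershell
# Hypothetical helper: run a script block with a temporary [Console]::OutputEncoding.
function Invoke-WithOutputEncoding {
    param(
        [Parameter(Mandatory)] [System.Text.Encoding] $Encoding,
        [Parameter(Mandatory)] [scriptblock] $ScriptBlock
    )

    $prev = [Console]::OutputEncoding
    try {
        [Console]::OutputEncoding = $Encoding
        & $ScriptBlock   # output streams to the caller's pipeline
    }
    finally {
        # Runs even if the script block fails or the pipeline is stopped with Ctrl+C.
        [Console]::OutputEncoding = $prev
    }
}

# Possible usage with the command from the question:
# Invoke-WithOutputEncoding -Encoding ([System.Text.Encoding]::Unicode) -ScriptBlock {
#     & 'C:\Program Files\Common Files\McAfee\SystemCore\aacinfo.exe' report
# } | Out-File -Encoding utf8 C:\temp\output.log
```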
    

    Note:

    • In Windows PowerShell and PowerShell (Core) up to v7.3.x, PowerShell invariably decodes output from external programs into .NET strings before processing it further, even when using >, the redirection operator.

    • In PowerShell v7.4+, > can pass the raw byte stream of an external program's stdout output through to a file (whereas the decoding into strings still happens if you use Out-File explicitly).

      • This not only speeds up I/O redirection with external programs but also prevents potential misinterpretation.

      • That said, capturing raw UTF-16LE data, specifically, will only result in a well-formed file if the byte stream starts with a BOM, which is not typical for to-stdout output.
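
    To illustrate the BOM caveat, the sketch below writes BOM-less UTF-16LE bytes to a temporary file (standing in for a raw capture; the file name is arbitrary), checks for the 0xFF 0xFE byte-order mark, and then reads the file back by naming the encoding explicitly.

```powershell
# Write BOM-less UTF-16LE bytes to a temporary file, mimicking a raw capture of such output.
$path  = Join-Path ([System.IO.Path]::GetTempPath()) 'utf16le-no-bom-demo.txt'
$bytes = [System.Text.Encoding]::Unicode.GetBytes("Reporting on point-product`r`n")  # GetBytes() adds no BOM
[System.IO.File]::WriteAllBytes($path, $bytes)

# A UTF-16LE BOM is the byte pair 0xFF 0xFE at the very start of the file.
$head   = [System.IO.File]::ReadAllBytes($path)
$hasBom = $head.Length -ge 2 -and $head[0] -eq 0xFF -and $head[1] -eq 0xFE

# Without a BOM, name the encoding explicitly when reading the file back.
$text = Get-Content -LiteralPath $path -Encoding Unicode

Remove-Item -LiteralPath $path   # clean up the demo file
```

    Here $hasBom is $false, yet $text contains the original line because -Encoding Unicode tells Get-Content to decode the file as UTF-16LE rather than guess.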

    • For a given external program that exhibits the problematic behavior at hand, it is worth checking if a different output character encoding can be requested, such as via a CLI parameter (option) or an environment variable.

      • For instance, the wsl.exe CLI also outputs UTF-16LE by default for certain subcommands, but can be instructed to use UTF-8 by setting the WSL_UTF8 environment variable ($env:WSL_UTF8=1). However, unless you're using > in v7.4+ or the output is all-ASCII, you'll still have to (temporarily) set [Console]::OutputEncoding to UTF-8 ([Console]::OutputEncoding = [Text.UTF8Encoding]::new()).
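
    Putting the wsl.exe case together might look like the sketch below. It assumes that wsl.exe --list --quiet is one of the subcommands affected, and it skips the call when wsl.exe is not installed; the variable names are illustrative.

```powershell
$prevEncoding = [Console]::OutputEncoding
$prevWslUtf8  = $env:WSL_UTF8

try {
    # Ask wsl.exe for UTF-8 output and tell PowerShell to decode it as UTF-8.
    $env:WSL_UTF8 = '1'
    [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()

    # Assumption: --list is one of the subcommands that emits UTF-16LE by default.
    if (Get-Command wsl.exe -ErrorAction SilentlyContinue) {
        $distros = wsl.exe --list --quiet
    }
}
finally {
    # Restore both settings; assigning $null removes the environment variable again.
    [Console]::OutputEncoding = $prevEncoding
    $env:WSL_UTF8 = $prevWslUtf8
}
```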