python powershell encoding utf-8

python3 powershell stdout cyrillic encoding


I need to do the following: I connect to a Windows server from Python via SSH and run a PowerShell command on it. I need to get the server's response and use it further. If I send a command like this:

command_company = (
            'powershell.exe -noprofile -command "'
            'Get-ADUser -Filter \\"sAMAccountName -like \'j.d\'\\" -Properties sAMAccountName, Mobile | Select-Object -ExpandProperty Mobile"'
        )

Then everything works and I get a mobile number. If I do it like this:

command_company = (
            'powershell.exe -noprofile -command "'
            'Get-ADUser -Filter \\"sAMAccountName -like \'j.d\'\\" -Properties sAMAccountName, Company | Select-Object -ExpandProperty Company"'
        )

then I don't get an error, but stdout comes back empty:

powershell.exe -noprofile -command "Get-ADUser -Filter \"sAMAccountName -like 'j.d'\" -Properties sAMAccountName, Company | Select-Object -ExpandProperty Company"
env: {}, command: powershell.exe -noprofile -command "Get-ADUser -Filter \"sAMAccountName -like 'j.d'\" -Properties sAMAccountName, Company | Select-Object -ExpandProperty Company", subsystem: None, exit_status: None, exit_signal: None, returncode: None, stdout: , stderr:
Output:
Exit Code: None

I'm guessing this might be an encoding issue with the response that PowerShell sends. If I run the command directly in PowerShell

Get-ADUser -Filter "sAMAccountName -like 'j.d'" -Properties sAMAccountName, Company | Select-Object -ExpandProperty Company

I get the answer:

PS C:\Windows\system32> C:\ServiceScripts\Test.ps1
ООО МерседесБенцРус

How can I change the encoding of the response?

Full code:

import asyncssh

# Note: `edited` (redacted connection details), `client`, `chat_id` and
# `reply_to_msg_id` are defined elsewhere in the program.
async def run_ssh_command(user_login):
    async with asyncssh.connect(edited) as conn:

        command_company = (
            'powershell.exe -noprofile -command "'
            'Get-ADUser -Filter \\"sAMAccountName -like \'j.d\'\\" -Properties sAMAccountName, Company | Select-Object -ExpandProperty Company"'
        )
        print(command_company)
        # Run the command on the server and wait for it to finish.
        result = await conn.run(command_company)
        print(result)
        output_company = result.stdout
        exit_code = result.exit_status
        print(f"Output: {output_company}")
        print(f"Exit Code: {exit_code}")
        await client.send_message(chat_id, f'From: {output_company}', reply_to=reply_to_msg_id)
        return output_company

Solution

  • It looks like you found a solution by switching to a different SSH library that lets you explicitly specify the character encoding used to decode the stdout output it receives.

    • Let me add some background information...

    • ... and offer a Python 3 solution that may work with your original code: I don't know how SSH factors into this (your original library, asyncssh, claims to use UTF-8), but the sample code below works with a local PowerShell call that uses UTF-8, as shown.


    The PowerShell CLI in both editions (powershell.exe for Windows PowerShell, pwsh.exe for PowerShell (Core) 7+) encodes its stdout output based on the active (output) console code page, which defaults to the system's OEM code page.
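    To illustrate why the code page matters, here is a minimal sketch. It assumes the server's OEM code page is 866 (the usual one on Russian-language Windows; this is an assumption, not something stated in the question) and simulates the bytes PowerShell would emit for the company name. Decoding those bytes as UTF-8 fails, while decoding them with the matching code page recovers the text.

```python
# Assumption: the OEM code page is 866; substitute the real one for your server.
company = 'ООО МерседесБенцРус'

# Simulate the raw stdout bytes PowerShell would emit under code page 866.
raw = company.encode('cp866')

# Decoding with the wrong encoding (UTF-8) fails on the very first byte.
try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the matching code page round-trips the text.
decoded = raw.decode('cp866')

print('UTF-8 decoding worked:', utf8_ok)
print('cp866 decoding matches:', decoded == company)
```

    A library that decodes stdout as UTF-8 would hit the same mismatch, which may explain a missing result without a visible error.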

    If you don't want to hard-code it, as in your code, you can use the following to determine it:

    from ctypes import windll; cp = windll.kernel32.GetConsoleOutputCP()
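    The number returned by GetConsoleOutputCP() still has to be turned into a codec name before it can be passed to .decode(). A small sketch of that step follows; codec_for_code_page is a hypothetical helper name, and the fallback value 866 on non-Windows systems is only there so the example runs anywhere.

```python
import codecs
import sys


def codec_for_code_page(cp: int) -> str:
    """Map a Windows code page number to a Python codec name (hypothetical helper)."""
    # 65001 is the UTF-8 code page; other pages map to Python's 'cpNNN' codecs.
    name = 'utf-8' if cp == 65001 else f'cp{cp}'
    codecs.lookup(name)  # raises LookupError if Python has no such codec
    return name


if sys.platform == 'win32':
    from ctypes import windll
    codec = codec_for_code_page(windll.kernel32.GetConsoleOutputCP())
else:
    # Assumption for illustration only: pretend the code page is 866.
    codec = codec_for_code_page(866)

print(codec)
```

    The resulting name can then be used as raw_bytes.decode(codec) on the captured output.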
    

    If you want to use UTF-8 instead - which is a necessity if the PowerShell command outputs characters that cannot be represented in the OEM code page - you can do the following:

    import subprocess
    from ctypes import windll

    # Sample text containing an ASCII char.,
    # a non-ASCII char. that can be mapped onto at least some OEM code pages,
    # and one that cannot.
    text = 'e€╳'
    # A sample PowerShell command that simply echoes the text.
    command = f"powershell.exe -noprofile -c '{text}'"

    # Save the current console output code page and temporarily change it to
    # UTF-8 (65001).
    oldCp = windll.kernel32.GetConsoleOutputCP()
    windll.kernel32.SetConsoleOutputCP(65001)
    try:
        # Call the PowerShell CLI, which now emits UTF-8 text.
        # Note the use of .decode(), which defaults to UTF-8 in v3+,
        # and the .strip() call to remove a trailing newline.
        out = subprocess.run(command, stdout=subprocess.PIPE).stdout.decode().strip()
    finally:
        # Restore the previous console output code page, even if the call fails.
        windll.kernel32.SetConsoleOutputCP(oldCp)

    # Print the captured output and its byte representation.
    print('received text: ' + out)
    print(out.encode())  # byte representation
    

    Note:

    • The above only ensures that the PowerShell child process emits UTF-8 and that its output is decoded as such inside the Python process, which is unrelated to what character encoding Python itself uses for its output streams.

    • To put Python v3.7+ itself in Python UTF-8 Mode, which makes it decode input as UTF-8 and produce UTF-8 output, pass the command-line option -X utf8 or set the environment variable PYTHONUTF8 to 1 before invocation.

      • (Only) if you do that, the alternative to calling .decode() above is to pass text=True as an additional argument to the subprocess.run() call.

      • Independently, passing encoding='utf-8' as an additional argument instead of calling .decode() is an option, too.
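    The encoding='utf-8' variant from the last note can be sketched as follows. To keep the example runnable on any platform, a child Python process that writes UTF-8 bytes stands in for the PowerShell CLI; that stand-in is an assumption for illustration, not part of the original answer.

```python
import subprocess
import sys

# Stand-in for the PowerShell call (assumption): a child process that writes
# the UTF-8 bytes for three Cyrillic letters to its stdout.
child_code = r"import sys; sys.stdout.buffer.write('\u041e\u041e\u041e'.encode('utf-8'))"

# encoding='utf-8' makes subprocess.run() decode stdout itself,
# so result.stdout is already a str and no .decode() call is needed.
result = subprocess.run(
    [sys.executable, '-c', child_code],
    stdout=subprocess.PIPE,
    encoding='utf-8',
)
out = result.stdout.strip()

# Print the byte representation to avoid depending on the console encoding.
print(out.encode('utf-8'))
```

    With a real PowerShell call, the console output code page would still have to be set to 65001 first, as in the sample above, so that the child actually emits UTF-8.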