Unicode issue with Python 3.7 and Scheduled Tasks


I am trying to collect the names of the scheduled tasks in Python using subprocess:

import subprocess
import sys

encoding = 'utf-8'

cmd = r'''$env:PYTHONIOENCODING = "%s";py -3 -c "print('® ¾ ü_ä_ö')"'''% encoding
#cmd = r'''$env:PYTHONIOENCODING = "%s"; schtasks /query ''' % encoding

data = subprocess.check_output(["powershell", "-C", cmd])
print(data.decode(encoding))

This works fine with the dummy cmd (printing the Unicode characters), but it fails when I run the schtasks command instead. Some tasks, such as Intel's and others, use Unicode symbols like ® or characters like ü_ä_ö in the task name.

This gives me the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 1228: invalid start byte

If I run the command directly from the command prompt or PowerShell, the output displays fine:

C:\Users\ricar\Google Drive\Bifrost\Collectors>schtasks /query

Folder: \
TaskName                                 Next Run Time          Status
======================================== ====================== ===============
Adobe Acrobat Update Task                12/4/2020 8:00:00 AM   Ready
AdobeAAMüpdater-1.0-MicrosoftAccount-ric 12/4/2020 2:00:00 AM   Ready
AdobeGCInvoker-1.0                       12/5/2020 12:30:00 AM  Ready
HPPSDrTelemetryWatch©                    12/12/2020 12:00:00 AM Ready
Intel-IMSS®                              N/A                    Ready

Any ideas what I am doing wrong?

Thanks


Solution

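  • Are you sure the schtasks output is in UTF-8?

    0x81 is ü in the IBM CP437 and IBM CP850 / IBM CP858 encodings. These are OEM code pages of the kind the Windows console commonly uses, and a lone 0x81 byte is never a valid UTF-8 start byte, which is consistent with the error you are seeing.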

    To check this, the pragmatic way is to print the raw bytes with repr(), or to decode with an error handler that keeps the offending byte values visible (e.g. decode(encoding, errors='backslashreplace')), then compare those values against code page tables to see which encoding matches.
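
The check described above can be sketched roughly as follows. This is only an illustration, not the asker's actual data: the sample bytes are made up by encoding two of the task names from the question in CP850, and the helper names show_undecodable and try_encodings, as well as the list of candidate code pages, are assumptions chosen for the example. On a real machine you would pass in the bytes returned by subprocess.check_output instead.

```python
# Hypothetical sample: two task names as they might arrive from schtasks
# if the console were using an OEM code page (CP850 is assumed here).
sample = "AdobeAAMüpdater Intel-IMSS®".encode("cp850")


def show_undecodable(data: bytes, encoding: str = "utf-8") -> str:
    # Decode without raising: bytes that are invalid in `encoding`
    # are kept visible as \xNN escapes instead of aborting the decode.
    return data.decode(encoding, errors="backslashreplace")


def try_encodings(data: bytes,
                  candidates=("utf-8", "cp437", "cp850", "cp858", "cp1252")) -> dict:
    # Decode the same bytes with each candidate encoding so the results
    # can be compared by eye against the expected task names.
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError as exc:
            results[name] = f"<failed: {exc.reason} at byte {exc.start}>"
    return results


# The escaped form shows exactly which byte values UTF-8 rejected.
print(show_undecodable(sample))  # AdobeAAM\x81pdater Intel-IMSS\xa9

# Map of candidate encoding -> decoded text (or failure description).
matches = try_encodings(sample)
```

Inspecting matches shows which candidates reproduce the names as they appear in the console. If one of the OEM code pages does, passing that name to data.decode() in the original script instead of 'utf-8' is the likely fix, though which code page applies depends on the machine's console settings.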