UnicodeDecodeError with QFileDialog in PyQt


Hello, I am having an issue with a file dialog function in my program.

First here is my code:

def getFileInfo(self):
    global logName
    logName = QtGui.QFileDialog.getOpenFileName()
    return logName

def getFileName(self):
    return logName

def compareAction(self):
    def process(infile, outfile, keywords):
        keys = [[k[0], k[1], 0] for k in keywords]
        endk = None
        with open(infile, 'rb') as fdin:
            with open(outfile, 'ab') as fdout:
                fdout.write("<" + words + ">" + "\r\n")
                for line in fdin:
                    if endk is not None:
                        fdout.write(line)
                        if line.find(endk) >= 0:
                            fdout.write("\r\n")
                            endk = None
                    else:
                        for k in keys:
                            index = line.find(k[0])
                            if index >= 0:
                                fdout.write(line[index + len(k[0]):].lstrip())
                                endk = k[1]
                                k[2] += 1
        if endk is not None:
            raise Exception(endk + "Not found before end of file")
        return keys
    clearOutput = open('test.txt', 'wb')
    clearOutput.truncate()
    clearOutput.close()
    outputText = 'test.txt'
    end_token = "[+][+]"
    inputFile = logName

    start_token = self.serialInputText.toPlainText()
    split_start = start_token.split(' ')
    for words in split_start:
        process(inputFile,outputText,((words + "SHOWALL"),))
        fo = open(outputText, "rb")
        text = fo.read()

    print start_token + '\r\n'
    print split_start
    print inputFile

Okay, so the general idea of this piece of code is to grab some text entered in a TextEdit in my PyQt GUI, split that string into a list, and use the list to 'scan' through the file. Any matches are then written out to another text document.

Steps:

  1. The user inputs text into the TextEdit.
  2. The text inside the TextEdit gets stored in a QString.
  3. That QString uses a space as a delimiter, so we split each entry into a list, i.e. This is a list -> [u'This', u'is', u'a', u'list'] (the list has a u due to my code using sip).
  4. Now that we have this QStringList, we can pass it through my def process function.
  5. We obviously need a file to search through; this is where the def getFileInfo(self) and def getFileName(self) functions come into play.
  6. After the user has entered some text and selected a file to search through, he/she will press a button, let's call it CompareButton, and it will execute the def compareAction(self) function.

Issue

Currently, my issue is an error that appears after doing all the steps: it fails on step number 6. This is my error:

Traceback (most recent call last):
  File "RETRACTED.py", line 278, in compareAction
    process(inputFile,outputText,((words + "SHOWALL"),))
  File "RETRACTED.py", line 260, in process
    index = line.find(k[0])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)

I am unsure as to why this error is happening. I have been searching for a similar issue, but I believe it has to do with my process function.


Solution

  • That specific error:

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)
    

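    looks like a problem with an (unexpected) Byte Order Mark (BOM) in the input file: 0xef at position 0 is the first byte of the UTF-8 BOM (EF BB BF). I suspect the log file is UTF-8 with BOM. The keyword comes from Qt as a unicode string, while a line read in 'rb' mode is a byte string, so Python 2 implicitly tries to decode the line as ASCII inside line.find(k[0]) and fails on that byte.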

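    Try changing your file open line to:

    io.open(infile, 'r', encoding='utf-8-sig')
    

    (with import io at the top of the file) to have the BOM stripped from the file. The built-in open() in Python 2 does not accept an encoding argument, and an encoding cannot be combined with binary mode 'rb', so the file has to be opened in text mode through io.open. The lines then come back as unicode strings, so the comparison with the keyword no longer needs an implicit ASCII decode. If the log contains non-ASCII text, the output file will probably need the same treatment: open it with io.open and an explicit encoding, and write unicode strings to it.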
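
    As a rough illustration of the difference, here is a small self-contained sketch. It does not use your log file: the sample file, the helper names read_lines and make_sample_log, and the keyword KEYSHOWALL are all made up for the demonstration. It writes a BOM-prefixed UTF-8 file, shows that decoding the BOM bytes as ASCII fails in the same way as your traceback, and compares reading with 'utf-8' against 'utf-8-sig'.

```python
import io
import os
import tempfile

UTF8_BOM = b"\xef\xbb\xbf"  # the three bytes that make up a UTF-8 BOM


def read_lines(path, encoding):
    """Read every line of a text file as unicode, using the given encoding."""
    with io.open(path, "r", encoding=encoding) as handle:
        return handle.readlines()


def make_sample_log():
    """Write a small BOM-prefixed UTF-8 file (made-up test data) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".log")
    with os.fdopen(fd, "wb") as handle:
        handle.write(UTF8_BOM + u"KEYSHOWALL first entry\n".encode("utf-8"))
    return path


# Decoding the BOM bytes as ASCII raises the same kind of error as in the traceback.
try:
    UTF8_BOM.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

sample_path = make_sample_log()
try:
    # Plain utf-8 keeps the BOM as a leading U+FEFF character.
    plain = read_lines(sample_path, "utf-8")
    # utf-8-sig drops the BOM when one is present.
    stripped = read_lines(sample_path, "utf-8-sig")
finally:
    os.remove(sample_path)

keyword = u"KEYSHOWALL"  # hypothetical search keyword
print(ascii_failed)
print(plain[0].find(keyword))
print(stripped[0].find(keyword))
```

    With 'utf-8' the keyword is found one character in, because the BOM survives as the first character of the line; with 'utf-8-sig' it is found at the very start. The sketch uses io.open so that it behaves the same on Python 2 and Python 3.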