I am trying to use the Python CSV reader for the first time. I have a method that asks the user to select the file they want to parse and then passes that file path to the parse method:
def parse(filename):
    parsedFile = []
    with open(filename, 'rb') as csvfile:
        dialect = csv.Sniffer().sniff(csvfile.read(), delimiters=';,|')
        csvfile.seek(0)
        reader = csv.reader(csvfile, dialect)
        for line in reader:
            parsedFile.append(line)
    return(parsedFile)

def selectFile():
    print('start selectFile method')
    localPath = os.getcwd() + '\Files'
    print(localPath)
    for fileA in os.listdir(localPath):
        print (fileA)
    test = False
    while test == False:
        fileB = input('which file would you like to DeID? \n')
        conjoinedPath = os.path.join(localPath, fileB)
        test = os.path.isfile(conjoinedPath)
    userInput = input('Please enter the number corresponding to which client ' + fileB + ' belongs to. \n\nAcceptable options are: \n1.A \n2.B \n3.C \n4.D \n5.E \n')
    client = ''
    if (userInput == '1'):
        client = 'A'
    elif (userInput == '2'):
        client = 'B'
    elif (userInput == '3'):
        client = 'CServices'
    elif (userInput == '4'):
        client = 'D'
    elif (userInput == '5'):
        client = 'E'
    return(client, conjoinedPath)

def main():
    x, y = selectFile()
    parse(y)

if __name__ == '__main__':
    main()
All of it seems to be working as intended, but I am getting a:
TypeError: can't use a string pattern on a bytes-like object
when trying to open filename (line 3 in the code). I have tried converting filename to both a string type and a bytes type, and neither seems to work.
Here is the output:
>>>
start selectFile method
C:\PythonScripts\DeID\Files
89308570_201601040630verifyppn.txt
89339985_201601042316verifyppn.txt
which file would you like to DeID?
89339985_201601042316verifyppn.txt
Please enter the number corresponding to which client 89339985_201601042316verifyppn.txt belongs to.
Acceptable options are:
1.Client A
2.Client B
3.Client C
4.Client D
5.Client E
3
Traceback (most recent call last):
  File "C:\PythonScripts\DeID\DeIDvA1.py", line 107, in <module>
    main()
  File "C:\PythonScripts\DeID\DeIDvA1.py", line 103, in main
    parse(y)
  File "C:\PythonScripts\DeID\DeIDvA1.py", line 63, in parse
    dialect = csv.Sniffer().sniff(csvfile.read(), delimiters=';,|')
  File "C:\Python34\lib\csv.py", line 183, in sniff
    self._guess_quote_and_delimiter(sample, delimiters)
  File "C:\Python34\lib\csv.py", line 224, in _guess_quote_and_delimiter
    matches = regexp.findall(data)
TypeError: can't use a string pattern on a bytes-like object
>>>
I am not sure what I am doing wrong.
The filename is not to blame here. The problem is that you are opening the file with:
with open(filename, 'rb') as csvfile:
The 'rb' mode specifies that the file is opened in binary mode, that is, its contents are read as bytes objects. From the documentation:
'b' appended to the mode opens the file in binary mode: now the data is read and written in the form of bytes objects. This mode should be used for all files that don’t contain text.
You then pass that bytes data to csv.Sniffer().sniff(), which searches it with a string (str) pattern, as the regexp.findall(data) call in your traceback shows. As the TypeError points out, this isn't allowed.
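The mismatch can be reproduced without the csv module at all. The following is a minimal sketch, assuming a simple delimiter pattern of my own (the patterns csv.Sniffer builds internally are more elaborate) and a made-up sample string:

```python
import re

# A str pattern, standing in for the ones csv.Sniffer compiles internally.
# The pattern itself is an assumption chosen for illustration.
pattern = re.compile(r"[;,|]")

text_sample = "a;b;c\n1;2;3\n"              # what a file opened with 'r' returns
bytes_sample = text_sample.encode("utf-8")  # what a file opened with 'rb' returns

# A str pattern applied to str data works.
matches = pattern.findall(text_sample)
print(matches)

# The same str pattern applied to bytes data raises TypeError.
try:
    pattern.findall(bytes_sample)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

The exact wording of the message varies slightly between Python 3 versions, but it always names the str pattern and the bytes-like object.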
Removing b from the mode and simply using 'r' will do the trick.
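As a sketch of what the corrected parse function could look like, assuming the file is plain text in the platform's default encoding: the newline='' argument is an addition of mine, following the csv module documentation's recommendation for file objects, and the temporary sample file exists only for this demonstration.

```python
import csv
import os
import tempfile


def parse(filename):
    """Return the rows of a delimited text file as a list of lists."""
    # Text mode ('r') yields str, which csv.Sniffer and csv.reader expect.
    # newline='' lets the csv module handle line endings itself.
    with open(filename, 'r', newline='') as csvfile:
        dialect = csv.Sniffer().sniff(csvfile.read(), delimiters=';,|')
        csvfile.seek(0)
        return list(csv.reader(csvfile, dialect))


# Hypothetical sample file, created only to exercise parse().
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False, newline='') as tmp:
    tmp.write('id|name|value\n1|alpha|10\n2|beta|20\n')
    sample_path = tmp.name

rows = parse(sample_path)
os.remove(sample_path)
print(rows)
```

If the file is not in the default encoding, passing an explicit encoding argument to open() would be the next thing to try.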
Note: Python 2.x doesn't exhibit this behavior on Unix machines. This is a result of the segregation of bytes and str objects as distinct types in 3.x.