I get this error:
Traceback (most recent call last):
  File "C:\Users\Anthony\PycharmProjects\ReadFile\main.py", line 14, in <module>
    masterFile.write("Line {}: {}\n".format(index, line.strip()))
  File "C:\Users\Anthony\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 8-13: character maps to undefined
The program is supposed to find every .txt file in a directory and search each one for a specific word. When it finds the word, it writes the matching line, with its line number, to one file. It also writes a copy of every line, with line numbers, to a second file. There will be around 100 .txt files, and it works on the first 3 before I get this error message. All the files are UTF-8 encoded. I tried changing the open call to
with open(file, encoding="utf-8") as f:
but it didn't work. Here is the full script:
import glob
searchWord = "Hello"
dataFile = open("C:/Users/Anthony/Documents/TextDataFolder/TextData.txt", 'w')
masterFile = open("C:/Users/Anthony/Documents/TextDataFolder/masterFile.txt", 'w')
files = glob.iglob("#C:/Users/Anthony/Documents/Texts/*.txt", recursive = True)
for file in files:
    with open(file) as f:
        print(file)
        for index, line in enumerate(f):
            #print("Line {}: {}".format(index, line.strip()))
            masterFile.write("Line {}: {}\n".format(index, line.strip()))
            if searchWord in line:
                print("Line {}: {}".format(index, line.strip()))
                dataFile.write("Line {}: {}\n".format(index, line.strip()))
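I eventually figured it out, and I feel like an idiot. The problem wasn't reading the files; it was writing them. I had only set the encoding on the files I read, not on the two files I write to, so the output files were still using cp1252 (the codec named in the traceback), which can't represent some of the characters in my text. The final version looks like this: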
import glob
searchWord = "Hello"
# Open both output files as UTF-8 so any character read from the inputs can be written back out.
dataFile = open("C:/Users/Anthony/Documents/TextDataFolder/TextData.txt", 'w', encoding="utf-8")
masterFile = open("C:/Users/Anthony/Documents/TextDataFolder/masterFile.txt", 'w', encoding="utf-8")
files = glob.iglob("#C:/Users/Anthony/Documents/Texts/*.txt", recursive = True)
for file in files:
    # Read each input file as UTF-8 as well.
    with open(file, "r", encoding="utf-8") as f:
        print(file)
        for index, line in enumerate(f):
            #print("Line {}: {}".format(index, line.strip()))
            masterFile.write("Line {}: {}\n".format(index, line.strip()))
            if searchWord in line:
                print("Line {}: {}".format(index, line.strip()))
                dataFile.write("Line {}: {}\n".format(index, line.strip()))
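
For anyone who wants to see the failure in isolation, here is a minimal sketch that reproduces it without my files. The helper name can_write and the sample text are made up for illustration, and I pass encoding="cp1252" explicitly to mimic what my Windows machine picked by default (an assumption based on the traceback above, so the sketch behaves the same on any OS).

```python
import os
import tempfile

# Hypothetical sample line: U+2192 (a right arrow) has no mapping in cp1252.
text = "Line 0: \u2192 arrow\n"

def can_write(encoding):
    """Return True if `text` can be written to a temp file using `encoding`."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)  # reopen the path below in text mode with the chosen encoding
    try:
        with open(path, "w", encoding=encoding) as out:
            out.write(text)  # the text is encoded here, so this is where it fails
        return True
    except UnicodeEncodeError:
        return False
    finally:
        os.remove(path)  # clean up the temp file either way

print("cp1252:", can_write("cp1252"))
print("utf-8:", can_write("utf-8"))
```

The read side never raised because decoding UTF-8 input was fine; it was only the write to a cp1252 file that could not encode the character.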