I am an amateur and am playing around with writing my own (very bad) compression tool, just for fun. The following code reads a text file and builds a dictionary of the indexes of every character in the file. I'm trying to read the file in 1K chunks, just for the hell of it, but for some reason I get an infinite loop. I'm guessing I've misunderstood something about the iter() built-in.
code:
def dictify(myFile):
    compDict = {}
    count = 0
    with open(myFile, 'r') as f:
        for chunk in iter(f.read, 1024):
            for ch in chunk:
                if ch in compDict:
                    compDict[ch].append(count)
                else:
                    compDict[ch] = []
                    compDict[ch].append(count)
                count += 1
            print(compDict)
    print(compDict)
dictify('test.txt')
The print statement was for debugging purposes; I left it in because it will make it clear to whoever runs the code where the infinite loop is. Also, the txt file can be anything. Mine just says "I am the walrus".
Any ideas what I'm doing wrong? Thanks!
This is not how iter() works.
Your example is given in the docs as:

from functools import partial
with open('mydata.db', 'rb') as f:
    for block in iter(partial(f.read, 64), b''):
        process_block(block)
If you use iter() with 2 arguments, the first must be a callable and the second a sentinel, i.e. the value iter() looks for to know when to terminate. In your case the second argument is an integer (1024), but f.read returns a string, so the loop will never terminate.
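To see the sentinel mechanism in isolation, here is a toy sketch (the list and values are made up for illustration): iter() keeps calling the callable until the return value equals the sentinel, which is then excluded from the iteration.

```python
nums = [0, 4, 1, 3]

# nums.pop (no arguments) removes and returns the last element.
# iter() calls it repeatedly and stops when it returns the sentinel 0.
popped = list(iter(nums.pop, 0))
print(popped)  # [3, 1, 4] -- the 0 stops the loop and is not included
```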
If you read your file in text mode (as opposed to binary) you need to make the following changes (I also adapted your block size):

with open('mydata.db', 'r') as f:
    for block in iter(partial(f.read, 1024), ''):
        process_block(block)
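Putting it together, a sketch of your dictify with the sentinel fixed (same logic as your original, just using '' as the sentinel since f.read returns an empty string at end of file):

```python
from functools import partial

def dictify(myFile):
    compDict = {}
    count = 0
    with open(myFile, 'r') as f:
        # f.read(1024) returns '' at EOF, which matches the
        # sentinel and ends the loop.
        for chunk in iter(partial(f.read, 1024), ''):
            for ch in chunk:
                if ch in compDict:
                    compDict[ch].append(count)
                else:
                    compDict[ch] = [count]
                count += 1
    return compDict
```

For "I am the walrus", dictify returns e.g. 'a' mapped to [2, 10] and ' ' mapped to [1, 4, 8].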