Tags: python-3.x, dictionary, data-science, nested-loops

Why is only half my data being passed into my dictionary?


When I run this script, I can verify that it loops through all of the values, but not all of them end up in my dictionary.

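import re
import PyPDF2

# Note: PdfFileReader, numPages, getPage and extractText are the pre-3.0 PyPDF2 names
file = open('path', 'rb')
readFile = PyPDF2.PdfFileReader(file)

lineData = {}

totalPages = readFile.numPages

# Compile the pattern once, outside the loops
newTrans = re.compile(r'Jan \d{2,}')

for i in range(totalPages):
    pageObj = readFile.getPage(i)
    # extractText is a method, so call it on the page object
    pageText = pageObj.extractText()
    for line in pageText.split('\n'):
        if newTrans.match(line):
            newValue = re.split(r'Jan \d{2,}', line)
            newValueStr = ' '.join(newValue)
            newKey = newTrans.findall(line)
            newKeyStr = ' '.join(newKey)
            print(newKeyStr + newValueStr)
            # Assigning to a key that already exists replaces its value
            lineData[newKeyStr] = newValueStr
print(len(lineData))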

There are 80+ data pairs, but when I run this the dict ends up with only 37 entries.


Solution

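  • Duplicate keys, most likely. A dict holds only one value per key, so every line that starts with the same date (for example, two transactions dated Jan 05) overwrites the entry stored before it. Try making lineData a list (lineData = []) and appending each pair with lineData.append({newKeyStr: newValueStr}), then check how many records you get.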
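
To see the overwrite in isolation, here is a minimal sketch that needs no PDF at all. The sample lines and the names sample_lines, as_dict, as_list and grouped are made up for illustration; only the regex comes from the question. It shows a plain dict next to two possible alternatives: a list of pairs, and a defaultdict that groups values by date.

```python
import re
from collections import defaultdict

# Hypothetical sample lines standing in for the text extracted from the PDF
sample_lines = [
    "Jan 05 Coffee 3.50",
    "Jan 05 Groceries 42.10",
    "Jan 12 Rent 900.00",
    "Balance forward 100.00",   # does not start with a date, so it is skipped
]

new_trans = re.compile(r'Jan \d{2,}')

as_dict = {}                  # one value per key: repeated dates overwrite
as_list = []                  # keeps every (key, value) pair
grouped = defaultdict(list)   # keeps every value, grouped under its date

for line in sample_lines:
    m = new_trans.match(line)
    if m:
        key = m.group()                 # e.g. "Jan 05"
        value = line[m.end():].strip()  # the rest of the line
        as_dict[key] = value
        as_list.append((key, value))
        grouped[key].append(value)

print(len(as_dict))   # 2: the second "Jan 05" replaced the first
print(len(as_list))   # 3: nothing was lost
print(dict(grouped))
```

If the list comes out longer than the dict on the real data, repeated dates are the cause. Whether a list of pairs or a date-to-list mapping fits better depends on how the records are used afterwards.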