How to find and extract values from a txt file?


Write a program that prompts for a file name, then opens that file and reads through the file, looking for lines of the form:

X-DSPAM-Confidence: 0.8475

Count these lines, extract the floating point values from each of the lines, and compute the average of those values and produce an output as shown below. Do not use the sum() function or a variable named sum in your solution.

This is my code:

fname = input("Enter a file name:",)
fh = open(fname)
count = 0
# this variable is to add together all the 0.8745's in every line
num = 0
for ln in fh:
    ln = ln.rstrip()
    count += 1
    if not ln.startswith("X-DSPAM-Confidence:    ") : continue
    for num in fh:
        if ln.find(float(0.8475)) == -1:
            num += float(0.8475)
        if not ln.find(float(0.8475)) : break
    # problem: values aren't adding together and gq variable ends up being zero
gq = int(num)
jp = int(count)
avr = (gq)/(jp)
print ("Average spam confidence:",float(avr))

The problem is that when I run the code, it reports an error because the value of num is zero. I then receive this:

ZeroDivisionError: division by zero

When I change the initial value of num to None a similar problem occurs:

int() argument must be a string or a number, not 'NoneType'

The Coursera Python autograder also does not accept this line when I put it at the top of the code:

from __future__ import division

The file name for the sample data they have given us is "mbox-short.txt". Here's a link: http://www.py4e.com/code3/mbox-short.txt


Solution

  • I edited your code as shown below. I assume the task is to find the numbers that follow X-DSPAM-Confidence:. I kept your check for identifying the X-DSPAM-Confidence: lines, with the extra trailing spaces removed from the prefix so that it actually matches. For each matching line I split the string on ':', took the element at index 1, and converted it to float.

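    fname = input("Enter a file name:")
    fh = open(fname)
    count = 0
    # running total of the confidence values (e.g. 0.8475) found on matching lines
    num = 0
    for ln in fh:
        ln = ln.rstrip()
        # skip every line that is not an X-DSPAM-Confidence line
        if not ln.startswith("X-DSPAM-Confidence:"): continue
        # count only the matching lines, not every line in the file
        count += 1
        # the text after the colon is the value; float() ignores the leading space
        num += float(ln.split(":")[1])
    fh.close()
    avr = num / count
    print("Average spam confidence:", avr)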
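    The same logic can also be written as a small function, which makes it easy to try without a file. This is only a sketch: the name average_confidence and the sample list are made up for illustration and are not part of the assignment. It also returns None when no line matches, which avoids a ZeroDivisionError. It includes a check showing why the original prefix never matched: it ended in four spaces, while the sample line above has a single space after the colon.

```python
def average_confidence(lines, prefix="X-DSPAM-Confidence:"):
    """Return the mean of the floats that follow `prefix`, or None if no line matches.

    `average_confidence` is a hypothetical helper name used for illustration.
    """
    count = 0
    total = 0.0  # running total; deliberately not named `sum`
    for line in lines:
        line = line.strip()
        if not line.startswith(prefix):
            continue
        count += 1
        # Everything after the prefix is the number; float() tolerates the leading space.
        total += float(line[len(prefix):])
    if count == 0:
        return None  # nothing matched, so there is no average to compute
    return total / count


# Made-up sample lines standing in for an opened file object.
sample = [
    "X-DSPAM-Confidence: 0.8475\n",
    "Subject: hello\n",
    "X-DSPAM-Confidence: 0.6178\n",
]

# The prefix from the question ends in four spaces, so it never matches a real line.
original_prefix_matches = sample[0].startswith("X-DSPAM-Confidence:    ")

print("Average spam confidence:", average_confidence(sample))
```

    With a real file, the call would look like average_confidence(open(fname)), since a file object yields its lines one at a time.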