Tags: python-2.7, tkinter, text-editor, text-widget

Cannot count the number of lines in a text file in python


Recently I have been working on a GUI plain-text editor in Python. The code calls the following function, which is supposed to count the number of lines that have been entered into the text widget:

def numlines():
    targetline = textPad.get(1.0, END)
    targetline.split()
    lines = 0
    for line in targetline:
        lines += 1
    return lines

The code runs; however, it does not give me the correct number of lines in the file. In fact, it appears that the number of letters per line and the spaces affect the line count (e.g., I enter two letters on each of two lines and get 6 lines). I cannot find an explanation for this issue.

I am running on Windows 7 with python 2.7.9 and tkinter.


Solution

  • Replace:

    targetline.split()
    lines = 0
    for line in targetline:
        lines += 1
    return lines
    

    With:

    return len(targetline.split('\n'))
    

    Example

    Let us create an example targetline:

    >>> targetline="line 1\nline 3\nline 3"
    >>> print targetline
    line 1
    line 3
    line 3
    

    Now, let's count the lines:

    >>> len(targetline.split('\n'))
    3
    

    An issue that I haven't addressed here is what textPad.get actually returns. Does it use \n for a line delimiter? Does it provide a final \n on the last line? These are issues that you will need to address before the code is correct.
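As a rough sketch of how those questions might be handled: as far as I know, Tkinter's Text widget reports one extra "\n" at the end when you ask for everything up to END. Assuming that is the case, a helper can drop a single trailing newline before splitting. The name count_lines is hypothetical and is not part of the original code.

```python
def count_lines(text):
    """Count lines in text, ignoring one trailing newline if present."""
    # Assumption: the widget appends exactly one trailing "\n" to its content.
    if text.endswith("\n"):
        text = text[:-1]
    # Splitting on "\n" only, so spaces inside a line do not matter.
    return len(text.split("\n"))


print(count_lines("ab\ncd\n"))  # two typed lines plus the assumed trailing newline
print(count_lines("ab\ncd"))    # same text without a trailing newline
print(count_lines("\n"))        # what an empty widget would return under the assumption
```

If the assumption holds, asking the widget for textPad.get('1.0', 'end-1c') instead should leave out the extra newline at the source, but check what your widget actually returns before relying on either approach.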

    Discussion

    Consider this line:

    targetline.split()
    

    Let's apply that to our example from above:

    >>> targetline.split()
    ['line', '1', 'line', '3', 'line', '3']
    

    There are two things to note:

    1. It splits, by default, on all whitespace.

    2. It returns a list. It does not alter targetline.
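Both points can be checked with the same example string. This is only a small illustration; the variable names words, lines and unchanged are made up for it.

```python
targetline = "line 1\nline 3\nline 3"

# split() with no argument breaks on every run of whitespace,
# so spaces and newlines are treated alike.
words = targetline.split()

# split("\n") breaks on newlines only.
lines = targetline.split("\n")

# Strings are immutable: split returns a new list and leaves
# the original string exactly as it was.
unchanged = targetline == "line 1\nline 3\nline 3"

print(words)
print(lines)
print(unchanged)
```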

    Now consider:

    lines = 0
    for line in targetline:
        lines += 1
    return lines
    

    Since targetline is unaltered by split, the for loop loops over every character in targetline. Consequently, this loop counts every character in targetline.
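This would also explain the 6 reported in the question. The sketch below assumes the widget returned "ab\ncd\n" for two lines of two letters each (two letters, a newline, two letters, and a trailing newline); that exact string is a guess, not something taken from the original program.

```python
def count_by_iterating(text):
    """Mimic the loop from the question: one increment per item in text."""
    count = 0
    for ch in text:  # iterating over a string yields one character at a time
        count += 1
    return count


# Hypothetical widget content: two lines of two letters each.
widget_text = "ab\ncd\n"

print(count_by_iterating(widget_text))  # number of characters, not lines
print(len(widget_text))                 # len gives the same number directly
```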

    If you just want the length of an object, such looping is not needed. The len function is simpler.