Ignore multiline comments in a text file when reading it in Python


I'm trying to count how many lines of code are in multiple text files in a directory using a Python script. I've come up with the following method, but it only works if the comment is on a single line, not when it spans several lines. Is there a way to handle multiline comments?

def remove_comments(line):
    if line.startswith('/*') or line.endswith('*/'):
        return 0
    else:
        return 1

count = sum(remove_comments(line) for line in f if line.strip())

Solution

  • A dirty hack could be to use a global variable:

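    with open("test", 'r') as f_in:
        f = f_in.readlines()

    # Module-level flag: True while we are inside a /* ... */ block.
    is_in_comment = False

    def remove_comments(line):
        """Return 0 for a comment line, 1 for a line of code."""
        global is_in_comment
        line = line.strip()

        if line.startswith('/*'):
            # A comment that also closes on this line does not open a block.
            is_in_comment = not line.endswith('*/')
            return 0
        elif line.endswith('*/'):
            is_in_comment = False
            return 0

        return 0 if is_in_comment else 1

    # Blank lines are skipped before they reach remove_comments.
    count = sum(remove_comments(line) for line in f if line.strip())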
    

    Note that this assumes a */ never appears without a preceding /*, and that comments start at the beginning of a line and end at the end of one. The code returns 3 for the following test file:

    That is one line
    Another
    /* Comment
    Other comment
    End comment */
    Final line, not a comment
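
The same idea can be written without a global by keeping the "inside a comment" flag local to one function. The following is only a sketch under the same assumptions as above (a comment opens at the start of a line and closes at the end of a line); the name count_code_lines and the sample list are made up for illustration and are not part of the original answer.

```python
def count_code_lines(lines):
    """Count non-blank lines that are not part of a /* ... */ comment.

    Assumption: a comment opens at the start of a line and closes at the
    end of a line; code sharing a line with a comment is not detected.
    """
    in_comment = False
    count = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines are never counted
        if in_comment:
            # The block ends on the line that finishes with */
            if line.endswith("*/"):
                in_comment = False
            continue
        if line.startswith("/*"):
            # Stay in comment mode unless the comment closes on this line.
            # The length check stops "/*/" from counting as open plus close.
            closed_here = len(line) >= 4 and line.endswith("*/")
            in_comment = not closed_here
            continue
        count += 1
    return count


# Hypothetical sample mirroring the test file above.
sample = [
    "That is one line\n",
    "Another\n",
    "/* Comment\n",
    "Other comment\n",
    "End comment */\n",
    "Final line, not a comment\n",
]
print(count_code_lines(sample))  # prints 3
```

Because an open file object iterates over its lines, the function can be called directly on each file in a directory and the results summed.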