import os


def countlines(start, lines=0, header=True, begin_start=None):
    if header:
        print('{:>10} |{:>10} | {:<20}'.format('ADDED', 'TOTAL', 'FILE'))
        print('{:->11}|{:->11}|{:->20}'.format('', '', ''))

    for thing in os.listdir(start):
        thing = os.path.join(start, thing)
        if os.path.isfile(thing):
            if thing.endswith('.py'):
                with open(thing, 'r') as f:
                    newlines = f.readlines()
                    newlines = list(filter(lambda l: l.replace(' ', '') not in ['\n', '\r\n'], newlines))
                    newlines = list(filter(lambda l: not l.startswith('#'), newlines))
                    newlines = len(newlines)
                    lines += newlines

                    if begin_start is not None:
                        reldir_of_thing = '.' + thing.replace(begin_start, '')
                    else:
                        reldir_of_thing = '.' + thing.replace(start, '')

                    print('{:>10} |{:>10} | {:<20}'.format(
                        newlines, lines, reldir_of_thing))

    for thing in os.listdir(start):
        thing = os.path.join(start, thing)
        if os.path.isdir(thing):
            lines = countlines(thing, lines, header=False, begin_start=start)

    return lines


countlines(r'/Documents/Python/')
Take a standard Python file, .main.py, that contains 4 lines of code: the script counts 5. How can I fix this? How do I set up the filters properly so that empty lines and comments are not counted?
1. You can modify your first filter condition: strip the line, and then check that it isn't empty.
lambda l: l.replace(' ', '') not in ['\n', '\r\n']
becomes
lambda l: l.strip()
2. filter takes any iterable, so there is no need to convert the result to a list every time. Doing so is wasteful because it forces two sets of iterations: one when you create the list, another when you filter it a second time. You could remove the calls to list() and only call it once after all your filtering is done. You can also use filter on the file handle itself, since the file handle f is an iterable that yields one line of the file per iteration. This way, you only iterate over the entire file once.
newlines = filter(lambda l: l.strip(), f)
newlines = filter(lambda l: not l.strip().startswith('#'), newlines)
num_lines = len(list(newlines))
Note that I renamed the last variable, because a variable name should describe what it holds.
3. You can combine both of your filter conditions into a single lambda:
lambda l: l.strip() and not l.strip().startswith('#')
or, if you have Python 3.8+,
lambda l: (l1 := l.strip()) and not l1.startswith('#')
This makes my point #2 about not calling list() on the intermediate results moot:
num_lines = len(list(filter(lambda l: (l1 := l.strip()) and not l1.startswith('#'), f)))
With the following input, this gives the correct line count:
file.py:
print("Hello World")
# This is a comment
# The next line is blank

print("Bye")
>>> with open('file.py') as f:
...     num_lines = len(list(filter(lambda l: (l1 := l.strip()) and not l1.startswith('#'), f)))
... print(num_lines)
Out: 2
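If the one-liner feels too dense, the same check can be pulled out into a small named function. The following is only a sketch of that idea: the names is_code_line and count_code_lines are made up for illustration and do not appear in your script, and a generator with sum() is swapped in for filter and len(list(...)).

```python
def is_code_line(line):
    """Return True if the line is neither blank nor a '#' comment."""
    stripped = line.strip()
    # An empty string is falsy, so blank and whitespace-only lines are rejected.
    return bool(stripped) and not stripped.startswith('#')


def count_code_lines(path):
    """Count the code lines in one file, iterating over it only once."""
    with open(path, 'r') as f:
        return sum(1 for line in f if is_code_line(line))
```

Inside countlines you could then write newlines = count_code_lines(thing). Because the line is stripped first, lines containing only tabs, a whitespace-only last line without a trailing newline, and indented comments are all skipped, which the original replace(' ', '') and startswith('#') checks would still count.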