As a hobby project and as a learning exercise, I decided to implement a software lines of code measurement script in Python.
However, I have a question:
Please note that I am aware that many tools already exist, some probably better than mine
(sloccount is one example); I am doing this purely as a hobby project.
You wouldn't normally count comments as lines of code, but the comment count can be a useful metric in its own right, so it may be worth keeping a separate tally of them as you parse the file.
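As a rough illustration of keeping separate tallies, here is a minimal sketch. The function name `classify_lines` and the `comment_prefix` parameter are made up for this example, and it assumes single-line comments that start with `#`. It does not handle block comments or docstrings, and a line with code followed by a trailing comment is counted as code.

```python
def classify_lines(text, comment_prefix="#"):
    """Return (code, comment, blank) line counts for a source string.

    Assumption: a comment is any line whose first non-whitespace
    characters are comment_prefix. Block comments are not detected.
    """
    code = comment = blank = 0
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            blank += 1          # empty or whitespace-only line
        elif stripped.startswith(comment_prefix):
            comment += 1        # whole-line comment
        else:
            code += 1           # anything else is treated as code
    return code, comment, blank


if __name__ == "__main__":
    sample = "x = 1\n# note\n\ny = 2  # trailing\n"
    print(classify_lines(sample))
```

Keeping the three counters separate means you can report SLOC and comment density from a single pass over the file.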
You are better off counting lines that are not just whitespace and that end with a CRLF that is not preceded by a line continuation character. In regex terms, that means you want to skip lines that match a pattern like this (assuming the backslash is your line continuation character):
\\\s*\r\n
If you find a line like that, don't increment the counter. Of course, the regex may differ depending on which language (and regex engine) you are using, and a regex may not even be the most appropriate way to do it: a simple state machine may be better.
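To make the state machine idea concrete, here is a minimal sketch in Python. The name `count_logical_lines` and the `continuation` parameter are hypothetical, and the sketch assumes a trailing backslash (optionally followed by whitespace) is the only continuation mechanism. It does not look inside string literals or comments, so a backslash at the end of a comment would also be treated as a continuation.

```python
def count_logical_lines(text, continuation="\\"):
    """Count non-blank logical lines, joining continued physical lines.

    Assumption: a physical line ending in `continuation` (ignoring
    trailing whitespace) continues onto the next physical line.
    """
    count = 0
    pending = False  # True while we are inside a continued statement
    for line in text.splitlines():  # handles both LF and CRLF endings
        stripped = line.strip()
        if stripped.endswith(continuation):
            pending = True          # statement is not finished yet
            continue
        if stripped or pending:
            count += 1              # a logical line ends here
        pending = False
    if pending:
        count += 1                  # file ended on a dangling continuation
    return count


if __name__ == "__main__":
    sample = "total = 1 + \\\n    2\n\nprint(total)\n"
    print(count_logical_lines(sample))
```

The single `pending` flag is the entire state here. Handling comments or multi-line strings would mean adding more states, which is where this approach tends to stay more readable than one growing regex.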