I would like to extract all added comment lines for a specific file. To do this, I extract all the comments with tokenize and ast.
Additionally, I get all the added lines for this file from git show commit -- pathfile.
I am having trouble getting the added comment lines, especially if they are just empty lines. My matching code looks like this:
    addedCommentLinesPerFile = []
    for commentline in parsedCommentLines:
        for line in addedLinesList:
            if commentline == line or commentline in line:
                try:
                    parsedCommentLines.remove(commentline)
                    addedLinesList.remove(line)
                except ValueError:
                    continue
                addedCommentLinesPerFile.append(commentline)
Let's say my file looks like this:
    def function():
    +    print("hello") #prints hello
    +
        """
        foo
        """
So the lists would look like this:
parsedCommentLines = ["#prints hello","foo",""]
addedLinesList = [' print("hello") #prints hello',""]
The desired output would be:
addedCommentLinesPerFile = ["#prints hello"]
But I would get:
addedCommentLinesPerFile = ["#prints hello",""]
commentline in line will indeed always return True if commentline is empty, regardless of what line contains.
If you want to first match the lines that match exactly, and then check whether the remaining comment lines are substrings of the remaining added lines, you could at least write two loops: the first one would only match if commentline == line, the second one if commentline in line.
You may want to check extra conditions on commentline before checking commentline in line: minimum length, non-whitespace characters, ...
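The two-loop idea could be sketched as follows. This is only an illustration: the function name match_added_comments and the min_length parameter are made up for this example, and skipping blank comment lines in both passes is an assumption based on the desired output in the question.

```python
def match_added_comments(comment_lines, added_lines, min_length=2):
    """Return the comment lines that also appear among the added lines."""
    # Work on copies so the caller's lists are not mutated while iterating.
    remaining_comments = list(comment_lines)
    remaining_added = list(added_lines)
    matched = []

    # Pass 1: exact matches only. Blank comment lines are skipped (assumption).
    for comment in list(remaining_comments):
        if comment.strip() and comment in remaining_added:
            remaining_comments.remove(comment)
            remaining_added.remove(comment)
            matched.append(comment)

    # Pass 2: substring matches, guarded by a minimum length so that an
    # empty or near-empty comment cannot match every added line.
    for comment in list(remaining_comments):
        if len(comment.strip()) < min_length:
            continue
        for line in remaining_added:
            if comment in line:
                remaining_comments.remove(comment)
                remaining_added.remove(line)
                matched.append(comment)
                break  # each added line is consumed at most once

    return matched
```

With the lists from the question, match_added_comments(parsedCommentLines, addedLinesList) keeps "#prints hello" and drops the empty string.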
If you want to check whether a one-line # comment sits at the end of an added line, write exactly that: check that commentline starts with a # and that line.endswith(commentline).
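A minimal sketch of that check, assuming comment is a single-line comment as extracted by tokenize; the helper name is_trailing_comment is hypothetical. Trailing whitespace on the line is ignored, which is an extra assumption.

```python
def is_trailing_comment(comment, line):
    """True if `comment` is a '#' comment sitting at the end of `line`."""
    comment = comment.strip()
    # startswith("#") also rejects empty strings, so blank lines never match.
    return comment.startswith("#") and line.rstrip().endswith(comment)
```

Note that this only compares text: a line whose string literal happens to end with the same characters would also match.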
Another approach could be to generate two files which contain only the comment lines, and compare these two files to see how comments were modified.
On the git side of things:
to list the files affected by commit, you can use:
    git show --format="" --name-only commit # or --name-status
for each of the modified files, you can get:
the content of the file before:
    git show commit~:path/to/file
the content of the file after:
    git show commit:path/to/file
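From Python, these commands could be wrapped roughly like this. The helper names rev_spec, file_at and changed_files, and the repo parameter, are made up for this sketch; it assumes git is on the PATH.

```python
import subprocess


def rev_spec(commit, path, before=False):
    """Build the <rev>:<path> argument understood by git show."""
    rev = f"{commit}~" if before else commit
    return f"{rev}:{path}"


def file_at(commit, path, before=False, repo="."):
    """Return the file content (as text) before or after the commit.

    Raises subprocess.CalledProcessError if git fails, e.g. when the file
    did not exist before the commit.
    """
    result = subprocess.run(
        ["git", "show", rev_spec(commit, path, before)],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return result.stdout


def changed_files(commit, repo="."):
    """List the paths touched by the commit."""
    result = subprocess.run(
        ["git", "show", "--format=", "--name-only", commit],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return [name for name in result.stdout.splitlines() if name]
```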
From these two contents, you can use your code to extract comments, and either write them to two files (e.g. /tmp/comments.before and /tmp/comments.after) and just run diff /tmp/comments.before /tmp/comments.after, or run a diff-like algorithm on the two lists of strings.
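For the second option, the standard library's difflib can compare two lists of strings directly. A minimal sketch, where added_lines is a hypothetical helper and before / after are the comment lists extracted from the two versions of the file:

```python
import difflib


def added_lines(before, after):
    """Return the lines of `after` that a diff reports as inserted or changed."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both contribute new lines on the `after` side.
        if tag in ("insert", "replace"):
            added.extend(after[j1:j2])
    return added
```

Because the comparison is done on comments only, a blank line added in the code never shows up unless the comment extractor itself emitted it.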