Tags: git, git-branch, author, lines-of-code, cloc

Counting lines of code per author in a git repository


I'm on a team with a few other programmers and need a lines-of-code count per author in our git repository. Counting the lines modified by each author isn't enough, because that would include blank and comment lines. Ideally, I would create a new branch containing only the commits of a specific author (--author="BtheDestroyer" for myself) and then use cloc to get the comment line and code line counts separately. I've tried using

git log --author="BtheDestroyer" --format=%H > mycommits
git checkout --orphan mycommits
tac mycommits| while read sha; do git cherry-pick --no-commit ${sha}; done

During the last line, however, I get a lot of errors like the following:

filepath: unmerged (commit-id-1)
filepath: unmerged (commit-id-2)
error: your index file is unmerged.
fatal: cherry-pick failed

I'm also not sure whether that will end up fast-forwarding through other commits in the process. Any ideas?


Solution

  • Answering my own question:

    I ended up using git blame and a Bash script that loops through the files in the source folder, strips the blame annotations back down to plain code using grep and cut, collects the output into temporary files, and then runs cloc on those.

    Here's my shell script for anyone wanting to do something similar (I have it in ./Blame/ so change SOURCE appropriately!):

    #!/bin/bash
    #Names of the users to check
    #  If you have multiple usernames, separate them with a space
    #  (so a single name must not contain spaces itself)
    #  The full name is not required, just enough to not be ambiguous
    USERS="YOUR USERNAMES HERE"
    #Directories
    SOURCE=../source
    
    for USER in $USERS
    do
        #clear blame files (truncate to empty without writing a stray blank line)
        : > "$USER-Blame.h"
        : > "$USER-Blame.cpp"
        : > "$USER-Blame.sh"
        echo "Finding blame for $USER..."
        #C++ files
        echo "  Finding blame for C++ files..."
        for f in $SOURCE/*.cpp
        do
            git blame "$f" | grep "$USER" | cut -c 70- >> "$USER-Blame.cpp"
        done
        #Header files
        echo "  Finding blame for Header files..."
        for f in $SOURCE/*.h
        do
            git blame "$f" | grep "$USER" | cut -c 70- >> "$USER-Blame.h"
        done
        #Shell script files
        echo "  Finding blame for shell script files..."
        for f in ./GetUSERBlame.sh
        do
            git blame "$f" | grep "$USER" | cut -c 70- >> "$USER-Blame.sh"
        done
    done
    
    for USER in $USERS
    do
        #cloc
        echo "Blame for all users found! Cloc-ing $USER..."
        cloc --quiet "$USER"-Blame.*
        #this line cleans up the temporary files
        #if you want to keep them for future reference, comment it out.
        rm -f "$USER"-Blame.*
    done
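
One fragile spot in the script above is the `grep "$USER" | cut -c 70-` step. The width of the annotation that git blame prints before each line depends on the longest author name and the line count of the file, so a fixed column may cut too much or too little. The grep also matches lines where the name merely appears in the code. A possible alternative, sketched below, is to parse `git blame --line-porcelain`, which prints an `author` header for every line and prefixes the source line itself with a TAB. The function name `blame_lines_by_author` is hypothetical and not part of the original script; treat this as a sketch under those assumptions rather than a drop-in replacement.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): reads the output of
# `git blame --line-porcelain` on stdin and prints only the source lines whose
# author field contains the given name.
blame_lines_by_author() {
    awk -v who="$1" '
        # Header line "author <name>": remember whether the next source line matches.
        /^author / { keep = (index(substr($0, 8), who) > 0) }
        # In porcelain output the source line is the only line starting with a TAB.
        /^\t/      { if (keep) print substr($0, 2) }
    '
}

# Example use, mirroring the C++ loop from the original script:
#   for f in "$SOURCE"/*.cpp; do
#       git blame --line-porcelain -- "$f" \
#           | blame_lines_by_author "$USER" >> "$USER-Blame.cpp"
#   done
```

Because the match is restricted to the `author` header, a partial name still works as in the original script, but a name that only shows up inside the code or a commit summary no longer causes a false match.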