Tags: git, bash, shell, xargs, git-log

How to run command on git log output per commit using bash


On our project we use git commit messages to attach certain metadata to our commits. The most obvious ones are JIRA ids, but there are more. I am trying to process commit messages to extract this data and do something with it, but I am having trouble running a script once per message rather than once per line. For example, here is what I am trying to do:

# Extract JIRA issue ids for last 5 commits
git log --format=format:%B --no-merges -n5 | while IFS= read -r line; do printf '%s\n' "$line" | grep -oP "(?<=JIRA: )[^ ]+"; done

#Output 
456
128
756

This does work in this case, but the problem is that the while loop runs once per line rather than once per log message, so grep is run too many times. This becomes a problem when I try to do something like:

 # Extract JIRA issue id and Gerrit change id from last 5 commits
 git log --format=format:%B --no-merges -n5 | while IFS= read -r line; do printf '%s\n' "$line" | grep -oP "(?<=Change-Id: )[^ ]+|(?<=JIRA: )[^ ]+"; done

#Output
Ida3e220cdfa80ace5164109916fb5015d7aaaaaa
Ic4eed79f8acf5bf56f848bf543168e4ac9aaaaaa
456
I51dc621df6f54539f05053f6e036cc97a7aaaaaa
128
Ic04fa3de9b5e453358292bc7b965139707aaaaaa
756
I453a99155dacdc693ee28f248e92a6ccc8aaaaaa

As you can see, each match from grep is printed on a separate line, but I would like to get output in the format:

Change-Id JIRA

Ida3e220cdfa80ace5164109916fb5015d7aaaaaa
Ida3e220cdfa80ace5164109916fb5015d7aaaaaa 128
Ic04fa3de9b5e453358292bc7b965139707aaaaaa 756

Do note that some commits don't have a JIRA id, and this should be reflected by an empty string or " ", as shown for the first commit.

I also tried using

# Extract JIRA ids using bash function and xargs
git log --format=format:%B --no-merges -n5 | xargs -L1 gitLogExtractJIRA

But this gives me the "xargs: gitLogExtractJIRA: No such file or directory" error
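
For context, xargs executes external programs, so a function defined in the current shell is not visible to it, which is consistent with the "No such file or directory" error. A minimal sketch of one workaround, assuming bash: export the function and have xargs start a child bash per NUL-terminated message. The body of gitLogExtractJIRA below is a hypothetical stand-in (the real one is not shown in the question), and run_per_message is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the asker's function: prints the JIRA id found
# in the message passed as $1, or an empty line if there is none.
gitLogExtractJIRA() {
  local id
  id=$(printf '%s\n' "$1" | sed -n 's/.*JIRA: \([^ ]*\).*/\1/p')
  printf '%s\n' "$id"
}
export -f gitLogExtractJIRA   # make the function visible to child bash processes

# Reads NUL-terminated messages on stdin and calls the exported function
# once per message (not once per line).
run_per_message() {
  xargs -0 -n1 bash -c 'gitLogExtractJIRA "$1"' _
}
```

Inside a repository this could be driven with `git log -z --format=%B --no-merges -n5 | run_per_message`, where `-z` makes git terminate each message with a NUL byte instead of a newline.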

I also looked into using https://git-scm.com/docs/git-for-each-ref, especially the --shell option, but I think it is not applicable in this case since I am iterating over log messages, not refs. I might be wrong on this, though, since I haven't used for-each-ref much.

So my question is how to run a bash function (or inline group of commands) per one message of git log output and not per line?


Solution

  • I managed to do what I wanted, but I think the approach is quite terrible and I would definitely like to do it more efficiently:

    # Extract ChangeId and JIRA from last 5 commits
    for i in {0..4}; do output="$(git log --format=format:%B --no-merges -n1 --skip="$i")"; echo $(parseChangeId "$output") $(parseJira "$output"); done
    # Output
    Ida3e220cdfa80ace5164109916fb5015d7aaaaaa
    Ida3e220cdfa80ace5164109916fb5015d7aaaaaa 128 
    Ic04fa3de9b5e453358292bc7b965139707aaaaaa 756
    Ic04fa3de9b5e453358292bc7b965139707aaaaaa
    Ic04fa3de9b5e453358292bc7b965139707aaaaaa 456
    

    The main problem is that I call the git binary 5 times instead of once; although it is quite fast, this is wasteful. echo is also called 5 times, which might be unnecessary. The calls to parseChangeId and parseJira (which are just wrappers around grep) must, with their current implementation, be made 5 times anyway.

    Also, I would like to add the subject from git log to this output, but the only way to do that with this approach is to add another call to git log with the %s format.
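
One way to avoid both the repeated git calls and the per-line loop is to have git emit NUL-terminated messages with -z and read one whole message per loop iteration. The following is a minimal sketch under stated assumptions, not a drop-in replacement: the function name extract_ids is made up, the "Change-Id:" and "JIRA:" patterns are inferred from the examples above, and the subject is taken to be the first line of %B. Bash's own regex matching stands in for the grep wrappers, so no extra process is started per commit.

```shell
#!/usr/bin/env bash
# Reads NUL-terminated commit messages on stdin and prints one
# tab-separated line per commit: Change-Id, JIRA id, subject.
# A missing id is printed as an empty field.
extract_ids() {
  local msg change jira subject
  local change_re='Change-Id:[[:space:]]*([^[:space:]]+)'
  local jira_re='JIRA:[[:space:]]*([^[:space:]]+)'
  # read -d '' consumes input up to the next NUL byte, i.e. one whole message
  while IFS= read -r -d '' msg || [[ -n $msg ]]; do
    change=''
    jira=''
    [[ $msg =~ $change_re ]] && change=${BASH_REMATCH[1]}
    [[ $msg =~ $jira_re ]] && jira=${BASH_REMATCH[1]}
    subject=${msg%%$'\n'*}   # assumption: first line of the message is the subject
    printf '%s\t%s\t%s\n' "$change" "$jira" "$subject"
  done
}

# Intended usage inside a repository (a single git invocation):
#   git log -z --format=%B --no-merges -n5 | extract_ids
```

Tabs are used as the separator so that an empty JIRA field stays distinguishable from the subject; a space-separated layout like the one above would work the same way if the subject column is dropped.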