
How do I match all but the first match in a line with sed?


I write my Git commit messages following a fixed pattern to make it easier to generate a changelog for new releases (https://stackoverflow.com/a/5151123/520162).

Every change that should appear in the changelog is prefixed with CHG, NEW or FIX.

When generating the changelog, I print each revision I'm going to parse with the following command:

git show --quiet --date=short --pretty=format:"%cd %an %s%n%n%w(100,21,21)%b%n" $CURRENTREVISION

The %s placeholder expands to the subject line of the commit.

Next, I use sed to reshape the generated output so that it fits the format of my changelog file.

Sometimes a subject line contains several occurrences of CHG, NEW or FIX. The subject output then looks like this:

DATE NAME FIX first change NEW second change CHG third change

I'd like to prefix all but the first occurrence of my keywords with a newline so that each CHG, NEW or FIX starts a new line:

DATE NAME FIX first change
          NEW second change
          CHG third change

What do I have to tell sed to achieve this?


Solution

  • sed isn't the most appropriate tool for this

    With awk it would look like this.

    awk '{n=0; for (i=1; i<=NF; i++) {if ($i ~ /(NEW|FIX|CHG)/) {$i=(n++?"\n          ":"")$i}}}7'
    
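    • n=0 resets the per-line counter of markers seen so far
    • for (i=1; i<=NF; i++) loops over every whitespace-separated field of the line
    • if ($i ~ /(NEW|FIX|CHG)/) tests whether the field contains one of the markers
      • $i=(n++?"\n          ":"")$i prefixes the field with a newline and ten spaces of indentation, except for the first marker on the line (n++ evaluates to 0, i.e. false, only the first time)
    • 7 is a truthy pattern with no action, so awk prints the (modified) line.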
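
To try the approach on the sample line from the question, here is a minimal sketch. The function names split_markers_awk and split_markers_sed are hypothetical, chosen only for illustration. The awk variant differs from the one-liner above in two ways, both of which are assumptions about what is wanted: the regex is anchored so that only a field that is exactly NEW, FIX or CHG counts as a marker, and the space that awk leaves in front of each inserted newline is trimmed. Since the question asked for sed, a second variant is included; it relies on GNU sed extensions (a numeric flag combined with g, and \n in the replacement) and will probably not work with other sed implementations. It also assumes each marker has a single space on both sides.

```shell
#!/bin/sh

# Hypothetical helper: awk variant, reads subject lines on stdin.
split_markers_awk() {
  awk '{
    n = 0                                  # markers seen so far on this line
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^(NEW|FIX|CHG)$/) {        # assumption: whole-field match only
        $i = (n++ ? "\n          " : "") $i
      }
    }
    gsub(/ +\n/, "\n")                     # drop the space left before each newline
  } 7'
}

# Hypothetical helper: GNU sed variant.
# The 2g flag skips the first match and replaces the second and all later ones.
split_markers_sed() {
  sed -E 's/ (CHG|NEW|FIX) /\n          \1 /2g'
}

line='DATE NAME FIX first change NEW second change CHG third change'
printf '%s\n' "$line" | split_markers_awk
printf '%s\n' "$line" | split_markers_sed
```

Both calls should print the three-line layout shown in the question. Note that the ten-space indentation is hard-coded to match the width of the "DATE NAME " placeholder; with real dates and author names it would have to be adjusted.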