Search code examples
bashshellawksedsh

Parse a changelog and extract changes for a version


I have a changelog file in markdown which contains all changes between each version of my app like that :

## Version 1.0.6

* first change
* second change
* third change

## Version 1.0.5

* first foo change
* second foo change

## Version 1.0.4

* and so on...

What I want is to extract in a script the changes content for a version. For example I would to extract the changes for the version 1.0.5, so it should print :

* first foo change
* second foo change

The ideal way would be ./getVersionChanges version filename which would those 2 params :

version : the version to extract changes

filename : the filename to parse

How can I achieve this with sed, awk, grep, or whatever ?


Solution

  • A slightly more elaborate awk solution, which

    • exits once the block of interest has been printed,
    • ignores blank lines,
    • doesn't include the header line.
    awk -v ver=1.0.5 '
     /^## Version / { if (p) { exit }; if ($3 == ver) { p=1; next } } p && NF
    ' file
    

    As script getVersionChanges:

    #!/usr/bin/env bash
    
    awk -v ver="$1" '
     /^## Version / { if (p) { exit }; if ($3 == ver) { p=1; next } } p && NF
    ' "$2"
    

    Explanation:

    • Regex condition /^## Version / matches the header line of a block of lines with version-specific information, by looking for substring ## Version at the start (^) of the line and, if found, executing the associated code block ({ ... }):

      • if (p) { exit } exits (stops processing), if the p (print) flag has previously been set, because that implies that the block after the one of interest has been reached, i.e. that the block of interest has now been fully processed.

      • if ($3 == ver) { p=1; next } checks if the 3rd whitespace-separated field ($3) on the header line matches the given version number (passed via option -v ver=1.0.5 and therefore stored in variable ver) and, if so, sets custom variable p, which serves as a flag indicating whether to print a line, to 1 and moves on to the next line (next), so as not to print the header line itself.
        In other words: p containing 1 indicates for subsequent lines that the version-specific block of interest has been entered, and that its lines should (potentially) be printed.

    • Condition p && NF implicitly prints the line at hand if the condition matches, which is the case if the print flag p is set and (&&) the line at hand has at least one field (based on the number of fields being reflected in built-in variable NF), i.e. if the line is non-blank, thereby effectively skipping empty and all-whitespace lines in the block of interest.

      • Note that both operands of && use implicit Boolean logic: a value of 0 (which a non-initialized custom variable such as p defaults to) is implicitly false, whereas any nonzero value implicitly true.