regex awk sed teamcity pc-lint

Process text file by indented pattern


I've tried some combinations of sed with s/regex/../ but without success. So here is my question: I have a text file that looks something like this (PC-lint output):

--- Module A
    Info: indented message 1
    Note: indented message 2
    Warning: indented message 3
--- Module B
--- Module C
    Info: indented message 1
--- Module D

I want to transform it into something like the following (TeamCity service messages):

[Start Module="Module A"]
    [Message Content="Info: indented message 1"]
    [Message Content="Note: indented message 2"]
    [Message Content="Warning: indented message 3"]
[End Module="Module A"]
[Start Module="Module B"]
[End Module="Module B"]
[Start Module="Module C"]
    [Message Content="Info: indented message 1"]
[End Module="Module C"]
[Start Module="Module D"]
[End Module="Module D"]

So I know that the text has to be split into blocks, one per "--- " line, and that each block then needs to be wrapped or substituted using regexes. But I have no real clue how to do this efficiently. Ideally I'd like to use the tools available in busybox (sed, awk, etc.) to keep the toolchain simple (it needs to work on Win64).

I'm comfortable with regex, but I was not able to scope the substitution per block. Any hints for me out there?


Solution

  • Awk can do this. Use one rule matching /^---/ that records the current module in a variable, prints the End line for the previous module (if any), and prints the Start line for the new one. A second rule prints the indented message lines, and an END rule closes the last module.

    $ awk '
        /^---/ {                      # module header, e.g. "--- Module A"
            if (m != "") print "[End Module=\"" m "\"]"
            m = $0; sub(/^--- */, "", m)
            print "[Start Module=\"" m "\"]"
            next
        }
        /^ +[^ ]/ {                   # indented message line
            sub(/^ +/, "")
            print "    [Message Content=\"" $0 "\"]"
        }
        END { if (m != "") print "[End Module=\"" m "\"]" }
      ' input
    [Start Module="Module A"]
        [Message Content="Info: indented message 1"]
        [Message Content="Note: indented message 2"]
        [Message Content="Warning: indented message 3"]
    [End Module="Module A"]
    [Start Module="Module B"]
    [End Module="Module B"]
    [Start Module="Module C"]
        [Message Content="Info: indented message 1"]
    [End Module="Module C"]
    [Start Module="Module D"]
    [End Module="Module D"]
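
For a build step it may be easier to keep the awk program in its own file, so that shell quoting stops being a concern (which can matter on Windows). The following is only a sketch of the approach described above, not a tested TeamCity integration. The file names `lint2tc.awk` and `lint.txt` and the helper `end_module` are hypothetical, and it assumes every module line starts with `---` and every message line is indented with spaces or tabs.

```shell
# Write the awk program to a file (lint2tc.awk is a hypothetical name).
cat > lint2tc.awk <<'EOF'
# Print the End line for the currently open module, if there is one.
function end_module() {
    if (module != "") print "[End Module=\"" module "\"]"
}

# Module header: close the previous module, then open the new one.
/^---/ {
    end_module()
    module = $0
    sub(/^---[ \t]*/, "", module)   # keep everything after the dashes
    print "[Start Module=\"" module "\"]"
    next
}

# Indented, non-empty line: strip the indent and wrap it as a message.
/^[ \t]+[^ \t]/ {
    sub(/^[ \t]+/, "")
    print "    [Message Content=\"" $0 "\"]"
}

# End of input: close the last module.
END { end_module() }
EOF

# Small sample in the format shown in the question (lint.txt is hypothetical).
cat > lint.txt <<'EOF'
--- Module A
    Info: indented message 1
--- Module B
EOF

out=$(awk -f lint2tc.awk lint.txt)
printf '%s\n' "$out"
```

Because the module name is taken as everything after the leading dashes rather than a fixed field, names containing spaces are kept intact. Unindented lines that are not module headers are silently dropped, which may or may not be what you want for real PC-lint output.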