Add a backslash to specific lines per negative lookaround with sed or perl


In our production Splunk environment we have over a hundred self-developed use cases in a dedicated app, which we want to export to clusters dedicated to other customers and the like. The savedsearches.conf is split between the default and local folders, and we'd like to merge the two beforehand via splunk btool --app=our_fancy_usecases savedsearches list > searches.conf. The resulting config file looks like this:

[Stanza #1]
some.config = some settings
description = just a few words
some.other.config = more settings
search = this is 
| a well
| formatted search
[Stanza #2]
some.config = 1
description = This is taking 
a lot of words
to explain
this time.
some.other.config = 2
search = a lot 
of searching 
to do
[This is yet another stanza]
the.end = nigh

The issue is that btool drops the backslashes (\) that mark a multiline value in the original config files. Any setting can be a multiline value, but doesn't necessarily have to be. When the backslashes are missing, Splunk reads the value of a setting only up to the end of the line; for the first stanza, the search in Splunk would end up as just this is, lacking the following two lines. So the config file should look like this:

[Stanza #1]
some.config = some settings
description = just a few words
some.other.config = more settings
search = this is \
| a well\
| formatted search
[Stanza #2]
some.config = 1
description = This is taking \
a lot of words\
to explain\
this time.
some.other.config = 2
search = a lot \
of searching \
to do
[This is yet another stanza]
the.end = nigh

Sadly, btool has no option to keep these backslashes, so I'm trying to get them back with sed or whatever tool does the job.

I first looked at how to match the lines I need to change with a regex. A stanza always comes in brackets [ ], and a setting always has a space, then =, then another space before the actual value. I came up with this negative lookahead: \n(?!^[A-Za-z0-9._]+ = .*$|^\[.*\]) This seems to match what I want to change quite well in regex101 for the config example at the top. sed, however, does not support lookarounds. Perl should be an option, but this is where I'm failing:

perl -pe 's/\n(?![A-Za-z0-9._]+ = .*$|\[.*\]$)/\\\n/' test.conf > test.out ; cat test.out

[Stanza #1]\
some.config = some settings\
description = just a few words\
some.other.config = more settings\
search = this is\
| a well\
| formatted search\
[Stanza #2]\
some.config = 1\
description = This is taking\
a lot of words\
to explain\
this time.\
some.other.config = 2\
search = a lot\
of searching\
to do\
[This is yet another stanza]\
the.end = nigh\

Sadly, this doesn't work as expected at all and simply adds a backslash to every line. I've also fiddled around a bit with sed:

sed '/^[A-Za-z0-9._]\+ = .*$\|^\[.*\]$/! s/.*/&\\/' test.conf > test.out ; cat test.out
[Stanza #1]
some.config = some settings
description = just a few words
some.other.config = more settings
search = this is
| a well\
| formatted search\
[Stanza #2]
some.config = 1
description = This is taking
a lot of words\
to explain\
this time.\
some.other.config = 2
search = a lot
of searching\
to do\
[This is yet another stanza]
the.end = nigh

Any ideas on how I can reach my goal?


Solution

  • perl -ne 'chomp;
              print "\\" if ! /\S \s = \s \S | ^ \[ .* \] /x;
              print "\n" if $. != 1;
              print;
              END { print "\n" }' < file
    
    • -n reads the input line by line and runs the code for each line;
    • chomp removes the final newline from input;
    • a backslash is printed if the line just read (the "next" line) is neither a setting nor a stanza header;
    • a newline is printed except before the first line;
    • the line just read is printed, without its newline;
    • at the end, a newline is printed.

    The main trick is to decide whether to print the backslash while looking at the next line. To achieve that, we don't print the newline after the current line right away, so when the next line is read in, we can still print the backslash at the end of the current one.