regex perl wildcard multiline non-greedy

Regex to non-greedily match across multiple lines up to a line that starts with a specific string


I am going to answer this myself, but this was giving me fits all day and although it is explained elsewhere, I thought I'd post it with my solution.

I came across a situation where I needed to replace some text spanning multiple lines. It wasn't hard to find threads about how to match across multiple lines. My case was a bit more difficult in that I needed to wildcard match any character across multiple lines, until stopping at the first non-indented closing bracket.

For demonstration purposes, I made a sample file that has the features that made this hard for me:

starting file:

cat << EOF > test.txt
server {
    abcdefg blablablabla
    pizza
    #blablablabla
    blablablabla {
    zazazazazaza
    }
    turtles
    #}
    ninjas
    blablablabla

} #comments that might or might not be here

server {
    blablablabla
    blablablabla
    blablablabla
    blablablabla
}

zabzazab

EOF

This was my desired output. Note that the bracket I am matching up to is neither the first nor the last closing bracket in the file. Its only distinguishing feature is that it is the first } at the beginning of a line after the start of my match:

server {
    wxyz

server {
    blablablabla
    blablablabla
    blablablabla
    blablablabla
} 

zabzazab

This is what I hoped would work. But when slurping with -0777 the whole file becomes a single string, and without the /m modifier ^ and $ only match at the very start and end of that string, so nothing was replaced:

~#  perl -0777 -pe 's/(abcdefg(.*?)(^}.*$))/wxyz/gs' test.txt
server {
    abcdefg blablablabla
    pizza
    #blablablabla
    blablablabla {
    zazazazazaza
    }
    turtles
    #}
    ninjas
    blablablabla

} #comments that might or might not be here

server {
    blablablabla
    blablablabla
    blablablabla
    blablablabla
}

zabzazab

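Matching the line start/end while also slurping was the sticking point. Without the anchors the match succeeds, but it stops at the first } it finds, which is the indented one: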

~# perl -0777 -pe 's/(abcdefg(.*?)(}))/wxyz/gs' test.txt
server {
    wxyz
    turtles
    #}
    ninjas
    blablablabla

} #comments that might or might not be here

server {
    blablablabla
    blablablabla
    blablablabla
    blablablabla
}

zabzazab


So is there a way I can get a regex to match between a string and the first instance of a } that appears at the beginning of a line? I'm open to using sed too, but I figured the non-greedy nature of my search would make perl a better choice.


Solution

  • Perhaps any of the following commands will do it

    perl -0777 -pe 's/abcdefg.*?(\nserver.*?)/wxyz\n$1/s' test.txt
    perl -0777 -pe 's/abcdefg.*?server/wxyz\n\nserver/s' test.txt
    perl -0777 -pe 's/abcdefg.*?}.*?}.*?}.*?\n/wxyz\n/s' test.txt
    perl -0777 -pe 's/abcdefg(.*?}){3}.*?\n/wxyz\n/s' test.txt
    perl -0777 -pe 's/abcdefg.*?\n}.*?\n/wxyz\n/s' test.txt
    

    Output

    server {
        wxyz
    
    server {
        blablablabla
        blablablabla
        blablablabla
        blablablabla
    }
    
    zabzazab