I have CI pipelines with a lot of before_script sections, and I would like to write a multiline regexp for them. I exported all the before_script sections to my-ci-jobs.txt with a Python script.
pcregrep -M 'before_script.*\n.*' my-ci-jobs.txt
"before_script": [
"yarn install"
"before_script": [
"yarn install"
"before_script": [
"yarn install"
"before_script": [
"yarn install"
"before_script": [
"yarn install"
"before_script": [
"yarn install"
"before_script": [
"yarn install"
"before_script": [
"yarn install"
"before_script": [
"yarn install"
"before_script": [
"yarn install"
"before_script": [
"yarn install"
This works fine, but sometimes there are more lines in before_script, so I would like to write a regular expression that catches everything between before_script and the first occurrence of ],. But when I implement it, it catches the longest match. This is my command (I will not paste the result here; it is the whole file up to the last ],):
pcregrep -M 'before_script.*(\n|.)*],' my-ci-jobs.txt
How can I make the regexp stop at the first match? Is there a better way to write a multiline regexp?
You almost never need (.|\n) in a regular expression; there are better ways to match any chars, including line break chars.
To match zero or more chars other than ], you may use the [^]]* pattern:
pcregrep -M 'before_script[^]]*]' file
If you need the first match only, add | head -1 (note that head counts lines, so for a multi-line match this keeps only its first line):
pcregrep -M 'before_script[^]]*]' file | head -1
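If you need the whole first match rather than its first line, one option is to run the same pattern from a script. The following is only a sketch in Python, assuming the file content has already been read into a string; the sample text and the helper name first_before_script are made up for illustration.

```python
import re

# Hypothetical sample standing in for the content of my-ci-jobs.txt.
text = '''"before_script": [
"yarn install",
"yarn build"
],
"before_script": [
"yarn install"
],
'''

def first_before_script(s):
    """Return the first before_script block up to its closing ], or None."""
    # Same pattern as in the pcregrep command: [^]]* stops at the first ].
    m = re.search(r'before_script[^]]*]', s)
    return m.group(0) if m else None

print(first_before_script(text))
```

re.search stops at the first match, so the second block is never reported, and the returned string spans all lines of the first block.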
Pattern details
before_script - some literal text
[^]]* - a negated bracket expression that matches any chars but a ] char, 0 or more times, as many as possible (since * is a greedy quantifier); it matches line break chars, too, because you pass the -M option to pcregrep
] - a literal ] char (no need to escape it because ] outside a character class is not special).
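To see the difference between the greedy (\n|.)* attempt and the negated bracket expression, here is a small sketch in Python rather than pcregrep. The sample text is invented, and the lazy .*? variant is an extra alternative that is not part of the commands above.

```python
import re

# Hypothetical sample: two before_script blocks with another section between.
text = (
    '"before_script": [\n"yarn install"\n],\n'
    '"script": [\n"yarn test"\n],\n'
    '"before_script": [\n"yarn install",\n"yarn build"\n],\n'
)

# Greedy: (\n|.)* runs to the end and backtracks to the LAST "],".
greedy = re.findall(r'before_script(?:\n|.)*\],', text)

# Negated bracket expression: [^]]* cannot cross a ], so each match
# ends at the first ] after before_script.
negated = re.findall(r'before_script[^]]*]', text)

# Lazy alternative (an assumption, not from the answer above): .*? with
# DOTALL expands only as far as the first "],".
lazy = re.findall(r'before_script.*?\],', text, flags=re.DOTALL)

print(len(greedy), len(negated), len(lazy))
```

The greedy pattern yields a single match covering almost the whole text, while the other two yield one match per before_script block.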