Regex: match a string only if another string appears exactly n lines above it, in any text editor supporting PCRE


I have a PHP database configuration file (opened in a code editor that supports PCRE) which contains credentials for several database engines, used to generate connection strings.

As mentioned in the title, I want to match a certain string, "DB_DATABASE", only if 'mysql' is present n lines (let's say 3) before it, as below:

....
....
'mysql' => [
    'driver' => 'mysql',
    'url' => env('DATABASE_URL'),
    'host' => env('DB_HOST', '127.0.0.1'),
    'port' => env('DB_PORT', '3306'),
    'database' => env('DB_DATABASE', 'anydb'),
....
....

If I say 3 lines before, then it should not match the "mysql" that is 4 lines before, i.e. it should not match 'mysql' => [. Note that the file contains connection details for other DB engines besides the one shown here, so I need to match mysql only when it is present exactly 3 lines above DB_DATABASE.

I tried some complex regexes but none gave me what I want. Here is one that I thought was at least close, but it does not work:

^[\S\s]*?mysql\S*$(?:\n.*){3}\S*DB_DATABASE

I would appreciate any help resolving this.


Solution

  • I would suggest matching the words as whole words and using the \K operator so that only the DB_DATABASE string is returned as the match:

    \bmysql\b.*(?:\n.*){3}\n.*\K\bDB_DATABASE\b
    

    Details:

    • \bmysql\b - a whole word mysql
    • .* - the rest of the line
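    • (?:\n.*){3} - three lines, each a newline followed by the rest of that line (the three lines between the mysql line and the target line)
    • \n - a newline that starts the target line
    • .* - any zero or more chars other than line break chars, as many as possible, up to the last DB_DATABASE on that line
    • \K - match reset operator that removes the text matched so far from the match memory buffer, so only what follows is reported as the match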
    • \bDB_DATABASE\b - a whole word DB_DATABASE.
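
To check the line counting outside an editor, here is a minimal Python sketch. Python's built-in re module does not support \K, so a capture group stands in for it here; that substitution is mine and not part of the PCRE pattern. The CONFIG text is modeled on the excerpt in the question, and the function name is made up for illustration.

```python
import re

# Sample text modeled on the config excerpt in the question.
CONFIG = """\
'mysql' => [
    'driver' => 'mysql',
    'url' => env('DATABASE_URL'),
    'host' => env('DB_HOST', '127.0.0.1'),
    'port' => env('DB_PORT', '3306'),
    'database' => env('DB_DATABASE', 'anydb'),
"""

# Same structure as the PCRE pattern; group 1 replaces \K (not available in re).
PATTERN = re.compile(r"\bmysql\b.*(?:\n.*){3}\n.*(\bDB_DATABASE\b)")


def find_db_database(text):
    """Return (line_number, matched_text) for each hit, based on group 1."""
    hits = []
    for m in PATTERN.finditer(text):
        line_number = text.count("\n", 0, m.start(1)) + 1
        hits.append((line_number, m.group(1)))
    return hits


print(find_db_database(CONFIG))  # [(6, 'DB_DATABASE')]
```

The match starts at the mysql on the 'driver' line. Starting from 'mysql' => [ fails, because the fourth line after it is the 'port' line, which does not contain DB_DATABASE.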

    NOTE: If your line endings are mixed or unknown, you should replace all occurrences of \n with \R in the pattern (\R matches any line ending sequence).
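
For reference, the same idea can be approximated where \R is not available. Python's built-in re module has no \R, so the sketch below spells out an alternation instead. It is an assumption-laden stand-in: it covers only CRLF, LF and lone CR, while PCRE's \R accepts more separators. The names NL, LINE, MIXED and find_any_eol are made up for illustration.

```python
import re

# Assumption: only CRLF, LF and lone CR are handled here. PCRE's \R also accepts
# rarer separators (vertical tab, form feed, NEL, U+2028, U+2029).
# The lookarounds keep a CRLF pair from being counted as two line breaks.
NL = r"(?:\r\n|\r(?!\n)|(?<!\r)\n)"
LINE = r"[^\r\n]*"  # rest of a line, stopping at either CR or LF

# Capture group 1 stands in for \K, which re does not support.
PATTERN_ANY_EOL = re.compile(
    rf"\bmysql\b{LINE}(?:{NL}{LINE}){{3}}{NL}{LINE}(\bDB_DATABASE\b)"
)


def find_any_eol(text):
    """Return the text captured by group 1 for every hit."""
    return [m.group(1) for m in PATTERN_ANY_EOL.finditer(text)]


# Hypothetical input mixing CRLF, LF and CR line endings.
MIXED = (
    "    'driver' => 'mysql',\r\n"
    "    'url' => env('DATABASE_URL'),\n"
    "    'host' => env('DB_HOST', '127.0.0.1'),\r"
    "    'port' => env('DB_PORT', '3306'),\r\n"
    "    'database' => env('DB_DATABASE', 'anydb'),\n"
)

print(find_any_eol(MIXED))  # ['DB_DATABASE']
```

In an editor with real PCRE support none of this is needed: \R already treats a CRLF pair as a single line break.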