
Multiline regex match in MariaDB/Mediawiki


I am trying to match text (contained in a Mediawiki template) that spans multiple lines, using the Replace Text extension in MW 1.31 on a server running MariaDB 10.3.22.

An example of the template is the following (other templates may exist on the same page):

{{WoodhouseENELnames
|Text=[[File:woodhouse_999.jpg|thumb|link={{filepath:woodhouse_999.jpg}}]]Αἰακός, ὁ, or say, son of Aegina.

<b class="b2">Of Aeacus</b>, adj.: Αἰάκειος.

<b class="b2">Descendant of Aeacus</b>: Αἰακίδης, -ου, ὁ.
}}

Other templates may appear above and below it, separated by a varying number of line breaks, e.g.:

{{MyTemplatename
|Text=text, text, text
}}
{{WoodhouseENELnames
|Text=text, text, text
}}
{{OtherTemplatename
|Text= text, text, text
}}

The template contains a varying number of lines and/or line breaks. I want to match the full template and delete it; that is, match from {{WoodhouseENELnames to its closing }} without matching any templates further down, i.e. stop matching if a further {{ is encountered.

The closest I got was using something like:

Find ({{WoodhouseENELnames\n\|Text=)(.*?)\n+(.*?)\n+(.*?)\n+(.*?)(\n+}})

I then add or remove (.*?)\n+ in the regex to match cases with more or fewer lines. The problem is that this expression might inadvertently match other templates following this one.

Is there a regex that would match all possible text and line breaks contained within the template (lazily, as there may be other templates above and below it on the same page)? The templates are delimited by an opening {{ and a closing }}.


Solution

  • Edited to clear up any confusion


    This is a recursion simulation for Java- and Python-style engines,
    which do not support function calls (recursion):

    (?s)(?={{WoodhouseENELnames)(?:(?=.*?{{(?!.*?\1)(.*}}(?!.*\2).*))(?=.*?}}(?!.*?\2)(.*)).)+?.*?(?=\1)(?:(?!{{).)*(?=\2$)

    Recursion Simulation demo

    Just check the matches for the result.
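If the nesting depth is known to be small, a simpler pattern with no recursion and no forward references may be enough. The following is a minimal sketch in Python, assuming at most one level of nested {{...}} inside the template (as with {{filepath:...}} in the question's example). The names PLAIN, INNER, TEMPLATE_RE and strip_template are illustrative and not part of the patterns above.

```python
import re

# Any single character that does not start "{{" or "}}".
PLAIN = r"(?:(?!\{\{|\}\}).)"
# One inner template with no further nesting, e.g. {{filepath:...}}.
INNER = r"\{\{" + PLAIN + r"*\}\}"

# Assumption: at most ONE level of nesting inside the target template.
TEMPLATE_RE = re.compile(
    r"\{\{WoodhouseENELnames(?:" + PLAIN + "|" + INNER + r")*\}\}",
    re.DOTALL,  # let "." cross line breaks
)


def strip_template(page: str) -> str:
    """Delete every WoodhouseENELnames template from the page text."""
    return TEMPLATE_RE.sub("", page)
```

Deeper nesting would need INNER to be expanded once per extra level, which is where real recursion becomes the cleaner option.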


    This is real recursion, for Perl- and PCRE-style engines:

    (?s){{WoodhouseENELnames((?:(?>(?:(?!{{|}}).)+)|{{(?1)}})*)}}

    Recursion demo
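To make it clearer what the recursive pattern is doing, here is the same balanced-brace idea written as a plain scanner instead of a regex. This is only a sketch of the technique, not the pattern above: it counts {{ and }} pairs and cuts from {{WoodhouseENELnames to the }} that brings the depth back to zero. The function name remove_template and its parameters are hypothetical.

```python
def remove_template(page: str, name: str = "WoodhouseENELnames") -> str:
    """Remove every {{name ...}} block, honouring nested {{...}} pairs."""
    opener = "{{" + name
    out = []
    pos = 0
    while True:
        start = page.find(opener, pos)
        if start == -1:
            out.append(page[pos:])  # nothing left to remove
            break
        depth = 0
        i = start
        end = -1
        # Walk forward, tracking how many "{{" are still open.
        while i < len(page) - 1:
            pair = page[i:i + 2]
            if pair == "{{":
                depth += 1
                i += 2
            elif pair == "}}":
                depth -= 1
                i += 2
                if depth == 0:
                    end = i  # index just past the matching "}}"
                    break
            else:
                i += 1
        if end == -1:
            # Unbalanced braces: leave the rest of the page untouched.
            out.append(page[pos:])
            break
        out.append(page[pos:start])
        pos = end
    return "".join(out)
```

Unlike the bounded patterns, this handles any nesting depth, but it has to run outside the regex engine, for example in a script that edits the page text.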


    Note that .NET does this differently and is not covered here.