I am trying to match the content under a heading whose title contains a [[Wikilink]] keyword (use case: Obsidian).
I want the match to include any lower-level headings nested beneath it.
Example: if [[Wikilink]] is in an H2, match everything below it until the next H2 or higher-level heading, or the end of the file.
Difficulty: the heading level of [[Wikilink]] is unknown. The regex should be able to parse multiple inconsistent files where [[Wikilink]] could be in an H1, H2, H3, etc.
My current regex, which fails as soon as it encounters any heading:
(^#+ )[^\[]*?\[\[Wikilink]\][^\n]*?\n([\S\s]*?)(?1)
Sandbox: https://regex101.com/r/bLdifP/1
Somewhat related to this question on SO: Regex to match markdown headings and text nested under specific heading
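Try the following regex, with the multiline flag enabled so that ^ and $ match at line boundaries. The heading line is the match itself, and the content below it is captured in group 2.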
^(#+) .*\[\[Wikilink\]\].*$(?=([\S\s]*?)(?:^(?!\1#+)#+ |\z))
Regex in action: https://regex101.com/r/bLdifP/4
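For readers who want to try this outside regex101, here is a minimal sketch of the same pattern in Python. It is a port, not the original PCRE: `\z` is written as `\Z`, which is how Python's `re` module spells the end-of-string anchor, and the names `SECTION_RE`, `find_sections`, and `sample` are made up for illustration.

```python
import re

# Port of the pattern above. Assumption: \Z stands in for PCRE's \z,
# since \Z is the end-of-string anchor in Python's re module.
SECTION_RE = re.compile(
    r"^(#+) .*\[\[Wikilink\]\].*$(?=([\S\s]*?)(?:^(?!\1#+)#+ |\Z))",
    re.MULTILINE,  # ^ and $ must match at every line boundary
)


def find_sections(text):
    """Return (hashes, heading_line, body) for each heading containing [[Wikilink]]."""
    return [(m.group(1), m.group(0), m.group(2)) for m in SECTION_RE.finditer(text)]


# Hypothetical note: the keyword sits in an H2 with an H3 nested below it.
sample = (
    "# Notes\n"
    "intro\n"
    "## Topic [[Wikilink]]\n"
    "body\n"
    "### Detail\n"
    "more\n"
    "## Other\n"
    "unrelated\n"
)

for hashes, heading, body in find_sections(sample):
    print(repr(heading), repr(body))
```

Because the body is captured inside a lookahead, the overall match is only the heading line; the section content lives in group 2 and starts with the newline that ends the heading.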
The regex can be broken down as follows.
^ the beginning of a "line"
--------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------
#+ '#' (1 or more times (matching the most
amount possible))
--------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------
' '
--------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------
\[ '['
--------------------------------------------------------------------
\[ '['
--------------------------------------------------------------------
Wikilink 'Wikilink'
--------------------------------------------------------------------
\] ']'
--------------------------------------------------------------------
\] ']'
--------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------
$ before an optional \n, and the end of a
"line"
--------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------
[\S\s]*? any character of: non-whitespace (all
but \n, \r, \t, \f, and " "),
whitespace (\n, \r, \t, \f, and " ")
(0 or more times (matching the least
amount possible)).
Match any character, including line
breaks.
--------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------
(?: group, but do not capture:
--------------------------------------------------------------------
^ the beginning of a "line"
--------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------
\1 what was matched by capture \1
--------------------------------------------------------------------
#+ '#' (1 or more times (matching the
most amount possible))
--------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------
#+ '#' (1 or more times (matching the
most amount possible))
--------------------------------------------------------------------
' '
--------------------------------------------------------------------
| OR
--------------------------------------------------------------------
\z the end of the string
--------------------------------------------------------------------
) end of grouping
--------------------------------------------------------------------
) end of look-ahead
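
The part that decides where a section stops is `^(?!\1#+)#+ `: a heading line ends the section unless it has strictly more `#` characters than the matched heading. The sketch below isolates that check in Python, substituting the contents of `\1` by hand; `ends_section` is a hypothetical helper written for this illustration, not part of the regex above.

```python
import re


def ends_section(heading_hashes, line):
    """True if `line` is a heading that ends a section opened by `heading_hashes`."""
    # Same stop condition as (?!\1#+)#+ followed by a space,
    # with \1 replaced by the literal run of hashes.
    stop = re.compile(r"(?!" + re.escape(heading_hashes) + r"#+)#+ ")
    # re.match anchors at the start of `line`, playing the role of ^.
    return stop.match(line) is not None


# Section opened by an H2 ("##"):
print(ends_section("##", "## Next"))     # same level
print(ends_section("##", "# Top"))       # higher level
print(ends_section("##", "### Sub"))     # deeper level, stays inside the section
print(ends_section("##", "plain text"))  # not a heading
```

Only the same-level and higher-level headings are reported as section ends; the deeper heading and the plain line are not, which is why nested subsections end up inside group 2.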