javascript regex obsidian

Regex to match markdown headings and text nested under specific heading


I am using Obsidian (which uses ECMAScript regular expressions) with the Obsidian_to_Anki plugin, and I have this page structure:

# Heading 1 ⤵
## Heading 1.1
Text of Heading 1.1
Text can span over multiple lines
Even more text
## Heading 1.2
Text of Heading 1.2
# Heading 2
## Heading 2.1
Text of Heading 2.1
## Heading 2.2
Text of Heading 2.2
# Heading 3 ⤵
## Heading 3.1
Text of Heading 3.1
## Heading 3.2
Text of Heading 3.2
# Heading 4

I need a RegExp that matches all ## headings, together with their text, that are nested under a # heading marked with ⤵. The ⤵ should function as a kind of switch here. Each ## heading and its text should be matched with capturing groups. Content nested under a # heading without the ⤵ should not be matched. Hence the matched text should be:

## Heading 1.1
Text of Heading 1.1
Text can span over multiple lines
Even more text
## Heading 1.2
Text of Heading 1.2
## Heading 3.1
Text of Heading 3.1
## Heading 3.2
Text of Heading 3.2

Here's what I came up with on regex101. My problem is that this way only the first ## heading and its text get matched, and I can't find a solution.


Solution

  • You might use:

    (?<=^# .*⤵(?:\n(?!# ).*)*)\n(^## .*)\n(?!^##? )(.*(?:\n(?!^##? ).*)*)
    

    The pattern matches (with the multiline flag m enabled, so that ^ matches at the start of every line):

    • (?<= Positive lookbehind, assert that to the left is
      • ^# .*⤵ Match # and a space at the start of a line, then the rest of the line ending on ⤵
      • (?:\n(?!# ).*)* Optionally match all following lines that do not start with a single # and a space
    • ) Close the lookbehind
    • \n Match a newline
    • (^## .*) Capture group 1, match ## followed by the rest of the line
    • \n Match a newline
    • (?!^##? ) Negative lookahead, assert that the line does not start with # or ## and a space
    • ( Capture group 2
      • .* Match the whole line
      • (?:\n(?!^##? ).*)* Optionally match all lines that do not start with # or ## and a space
    • ) Close group 2
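
To try the pattern outside the plugin, here is a minimal sketch in plain JavaScript. It assumes an engine with lookbehind support (ES2018 or later). The helper name `extractSections` and the sample `note` string are made up for illustration and are not part of Obsidian or the Obsidian_to_Anki plugin.

```javascript
// Pattern from the answer above.
// g: find every matching section; m: let ^ match at the start of each line.
const pattern = /(?<=^# .*⤵(?:\n(?!# ).*)*)\n(^## .*)\n(?!^##? )(.*(?:\n(?!^##? ).*)*)/gm;

// Sample note mirroring the page structure from the question.
const note = [
  "# Heading 1 ⤵",
  "## Heading 1.1",
  "Text of Heading 1.1",
  "Text can span over multiple lines",
  "Even more text",
  "## Heading 1.2",
  "Text of Heading 1.2",
  "# Heading 2",
  "## Heading 2.1",
  "Text of Heading 2.1",
  "## Heading 2.2",
  "Text of Heading 2.2",
  "# Heading 3 ⤵",
  "## Heading 3.1",
  "Text of Heading 3.1",
  "## Heading 3.2",
  "Text of Heading 3.2",
  "# Heading 4",
].join("\n");

// Hypothetical helper: collect capture group 1 (the ## heading)
// and capture group 2 (the text below it) for every match.
function extractSections(text) {
  return Array.from(text.matchAll(pattern), (m) => ({
    heading: m[1],
    body: m[2],
  }));
}

const sections = extractSections(note);
for (const { heading, body } of sections) {
  console.log(heading);
  console.log(body);
}
```

With this sample, only the ## sections under the two ⤵ headings should be collected; the sections under # Heading 2 are skipped because the lookbehind cannot reach a ⤵ line without crossing a line that starts with # and a space.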
