How to parse nested containers with CommonMark for PHP?


I am attempting to create a spoiler block using League’s CommonMark package.

The block is opened by three inverted exclamation marks (¡¡¡), optionally followed by a summary, and closed by three regular exclamation marks (!!!) on a line of their own.

This is the code I have so far:

Element

<?php
use League\CommonMark\Block\Element\AbstractBlock;
use League\CommonMark\Cursor;
class Spoiler extends AbstractBlock {
    private $summary;
    public function __construct($summary = null) {
        parent::__construct();
        $this->summary = $summary;
    }
    public function getSummary() { return $this->summary; }
    public function canContain(AbstractBlock $block) { return true; }
    public function acceptsLines() { return true; }
    public function isCode() { return false; }
    public function matchesNextLine(Cursor $cursor) {
        if ($cursor->match('(^!!!$)')) {
            $this->lastLineBlank = true;
            return false;
        }
        return true;
    }
}

Parser

<?php
use League\CommonMark\Block\Parser\AbstractBlockParser;
use League\CommonMark\ContextInterface;
use League\CommonMark\Cursor;
class SpoilerParser extends AbstractBlockParser {
    public function parse(ContextInterface $context, Cursor $cursor) {
        if ($cursor->isIndented()) return false;

        $previousState = $cursor->saveState();
        $spoiler = $cursor->match('(^¡¡¡(\s*.+)?)');
        if (is_null($spoiler)) {
            $cursor->restoreState($previousState);
            return false;
        }

        $summary = trim(mb_substr($spoiler, mb_strlen('¡¡¡')));
        if ($summary !== '') {
            $context->addBlock(new Spoiler($summary));
        } else {
            $context->addBlock(new Spoiler());
        }

        return true;
    }
}

Renderer

<?php
use League\CommonMark\Block\Element\AbstractBlock;
use League\CommonMark\Block\Renderer\BlockRendererInterface;
use League\CommonMark\ElementRendererInterface;
use League\CommonMark\HtmlElement;
class SpoilerRenderer implements BlockRendererInterface {
    public function render(AbstractBlock $block, ElementRendererInterface $htmlRenderer, $inTightList = false) {
        if (!($block instanceof Spoiler)) throw new \InvalidArgumentException('Incompatible block type: ' . get_class($block));
        $summary = new HtmlElement('summary', [], $block->getSummary() ?: 'Click to expand spoiler');
        $content = $summary . "\n" . $htmlRenderer->renderBlocks($block->children());
        return new HtmlElement('details', [], $content);
    }
}

The problem occurs when I nest multiple spoiler blocks: the first terminator closes all the blocks.

¡¡¡
1
¡¡¡
2
¡¡¡
Hello
!!!
3
!!!
4
!!!

This is the parsed AST:

League\CommonMark\Block\Element\Document
    App\Helpers\Formatting\Element\Spoiler
        League\CommonMark\Block\Element\Paragraph
            League\CommonMark\Inline\Element\Text "1"
        App\Helpers\Formatting\Element\Spoiler
            League\CommonMark\Block\Element\Paragraph
                League\CommonMark\Inline\Element\Text "2"
            App\Helpers\Formatting\Element\Spoiler
                League\CommonMark\Block\Element\Paragraph
                    League\CommonMark\Inline\Element\Text "Hello"
    League\CommonMark\Block\Element\Paragraph
        League\CommonMark\Inline\Element\Text "3"
        League\CommonMark\Inline\Element\Newline
        League\CommonMark\Inline\Element\Text "!!!"
        League\CommonMark\Inline\Element\Newline
        League\CommonMark\Inline\Element\Text "4"
        League\CommonMark\Inline\Element\Newline
        League\CommonMark\Inline\Element\Text "!!!"

This is the expected AST:

League\CommonMark\Block\Element\Document
    App\Helpers\Formatting\Element\Spoiler
        League\CommonMark\Block\Element\Paragraph
            League\CommonMark\Inline\Element\Text "1"
        App\Helpers\Formatting\Element\Spoiler
            League\CommonMark\Block\Element\Paragraph
                League\CommonMark\Inline\Element\Text "2"
            App\Helpers\Formatting\Element\Spoiler
                League\CommonMark\Block\Element\Paragraph
                    League\CommonMark\Inline\Element\Text "Hello"
            League\CommonMark\Block\Element\Paragraph
                League\CommonMark\Inline\Element\Text "3"
        League\CommonMark\Block\Element\Paragraph
            League\CommonMark\Inline\Element\Text "4"

Solution

  • In this scenario, matchesNextLine() always runs on the top-level Spoiler first, because DocParser::resetContainer() walks the AST from the document root downward. When the outermost Spoiler sees !!! and returns false, the blocks nested inside it are closed along with it. Instead, I'd recommend using SpoilerParser::parse() to check for the ending syntax. For example, you could add something like this inside your existing parser:

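    // A line consisting only of !!! is a spoiler terminator.
    if ($cursor->match('/^!!!$/')) {
        // Walk up from the current container to the nearest enclosing Spoiler.
        $container = $context->getContainer();
        do {
            if ($container instanceof Spoiler) {
                // Point the parser context at that Spoiler and stop.
                $context->setContainer($container);
                $context->setTip($container);
                $context->getBlockCloser()->setLastMatchedContainer($container);
                return true;
            }
        } while ($container = $container->parent());
    }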
    

    This seems to produce the expected output:

    <details><summary>Click to expand spoiler</summary>
    <p>1</p>
    <details><summary>Click to expand spoiler</summary>
    <p>2</p>
    <details><summary>Click to expand spoiler</summary>
    <p>Hello</p></details>
    <p>3</p></details>
    <p>4</p></details>
    <p></p>
    

    Disclaimer: While the AST is probably correct based on this output, I did not verify the AST itself. I also didn't check whether my suggestion negatively impacts the parsing process, potentially causing issues with other elements or deeper nesting, so you may want to trace through that. But this general approach (parsing !!! in the parser and manipulating the context/AST) is probably your best option.

    I hope that helps!
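
If you want to trace the nesting independently of the library, a minimal standalone sketch like the one below may help. It does not use League CommonMark at all: the Node class and the nestSpoilers() and dumpTree() functions are hypothetical names made up for illustration, and every non-blank text line becomes its own paragraph, which is a simplification. It only models the rule "each !!! closes the innermost open spoiler" with a stack, so you can compare its tree against what your parser produces.

```php
<?php
declare(strict_types=1);

// Hypothetical tree node; not part of League CommonMark.
final class Node
{
    /** @var string */
    public $type;
    /** @var string|null */
    public $text;
    /** @var Node[] */
    public $children = [];

    public function __construct(string $type, ?string $text = null)
    {
        $this->type = $type;
        $this->text = $text;
    }
}

// Builds a tree from raw lines using a stack of open containers.
function nestSpoilers(array $lines): Node
{
    $root = new Node('document');
    $stack = [$root]; // open containers, innermost last

    foreach ($lines as $line) {
        $line = trim($line);
        $tip = $stack[count($stack) - 1];

        if (preg_match('/^¡¡¡\s*(.*)$/u', $line, $m) === 1) {
            // Opener: start a spoiler inside the innermost open container.
            $spoiler = new Node('spoiler', $m[1] !== '' ? $m[1] : null);
            $tip->children[] = $spoiler;
            $stack[] = $spoiler;
        } elseif ($line === '!!!' && count($stack) > 1) {
            // Terminator: close only the innermost open spoiler.
            array_pop($stack);
        } elseif ($line !== '') {
            // Simplification: one paragraph per non-blank line.
            $tip->children[] = new Node('paragraph', $line);
        }
    }

    return $root;
}

// Renders the tree as an indented outline for comparison with an AST dump.
function dumpTree(Node $node, int $depth = 0): string
{
    $label = $node->type . ($node->text !== null ? ' "' . $node->text . '"' : '');
    $out = str_repeat('    ', $depth) . $label . "\n";
    foreach ($node->children as $child) {
        $out .= dumpTree($child, $depth + 1);
    }
    return $out;
}

// Usage: the sample input from the question.
$sample = ['¡¡¡', '1', '¡¡¡', '2', '¡¡¡', 'Hello', '!!!', '3', '!!!', '4', '!!!'];
echo dumpTree(nestSpoilers($sample));
```

For the sample input from the question, this places "3" inside the second spoiler and "4" inside the first, which is the same nesting as the expected AST. A !!! with no open spoiler is kept as plain text in this sketch; whether that matches the behaviour you want in the real parser is a separate design choice.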