Tags: javascript, regex, markdown, frontend, pagedown

Adding three-backtick code fences to a Markdown parser: the fence breaks on a blank line


Stack Exchange's Markdown parser only accepts four-space indents for code blocks, but many other Markdown flavors, such as CommonMark and GitHub Flavored Markdown, also support code fences delimited by three backticks.

So I want to add this feature to Markdown.Converter.js. It works well for code blocks that contain no blank lines, but as soon as the code block contains a blank line, the fence breaks. Here is the _DoCodeFence function I wrote for this feature:

    function _DoCodeFence(text) {
        text = text.replace(/(^|[^\\`])(`{3,})(\n)(?!`)([^\r]*?[^`])\2(?!`)/gm,
            function (wholeMatch, m1, m2, m3, m4, m5) {
                var c = m4;
                c = c.replace(/^([ \t]*)/g, "");
                c = c.replace(/[ \t]*$/g, "");
                c = _EncodeCode(c);
                c = c.replace(/:\/\//g, "~P");
                return m1 + "<pre><code>" + c + "</code></pre>";
            }
        );
        return text;
    }



Solution

  • You need to fix the regex as follows:

    function _DoCodeFence(text) {
        text = text.replace(/((?:^|[^\\])(?:\\{2})*)(`{3,})(\r?\n)(?!`)([^\r]*?[^`])\2(?!`)/gm,
            function (wholeMatch, m1, m2, m3, m4, m5) {
                var c = m4;
                c = c.replace(/^[^\S\r\n]+|[^\S\r\n]+$/g, "");
                //c = _EncodeCode(c);
                c = c.replace(/:\/\//g, "~P");
                return m1 + "<pre><code>" + c + "</code></pre>";
            }
        );
        return text;
    }
    console.log(_DoCodeFence("\\\\```\n code\n   here\n```\n\n```\nCODE\n\n   HERE\n```"));
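
    The snippet above comments out the _EncodeCode call, presumably so that it can run outside Markdown.Converter.js. As a rough sketch of how the encoding step might be switched back on for stand-alone testing, the version below uses a hypothetical stand-in, encodeCodeSketch, that only HTML-escapes &, < and >. Both encodeCodeSketch and doCodeFenceSketch are illustrative names, not part of the converter, and the real _EncodeCode may do more than this.

```javascript
// Hypothetical stand-in for the converter's internal _EncodeCode; it only
// HTML-escapes &, < and >, which is an assumption about what is needed here.
function encodeCodeSketch(code) {
    return code
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Same regex and trimming as the answer's _DoCodeFence, with the encoding
// step switched back on through the stand-in above.
function doCodeFenceSketch(text) {
    return text.replace(
        /((?:^|[^\\])(?:\\{2})*)(`{3,})(\r?\n)(?!`)([^\r]*?[^`])\2(?!`)/gm,
        function (wholeMatch, m1, m2, m3, m4) {
            // Trim horizontal whitespace at the very start and end only.
            var c = m4.replace(/^[^\S\r\n]+|[^\S\r\n]+$/g, "");
            c = encodeCodeSketch(c);
            // Kept as in the source: mask "://" inside the code.
            c = c.replace(/:\/\//g, "~P");
            return m1 + "<pre><code>" + c + "</code></pre>";
        }
    );
}

// A fenced block with a blank line and a character that needs escaping.
var html = doCodeFenceSketch("```\nif (a < b) {\n\n    run();\n}\n```");
console.log(html);
```

    With this input, the blank line stays inside the single <pre><code> element and the < is emitted as &lt;.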

    Regex details

    • ((?:^|[^\\])(?:\\{2})*) - Group 1: the start of a line (^) or any char other than \ ([^\\]), followed by zero or more pairs of \ chars; this makes sure the opening backtick is not escaped
    • (`{3,}) - Group 2: three or more backticks
    • (\r?\n)(?!`) - Group 3: a line break (CRLF or LF) not followed by a backtick
    • ([^\r]*?[^`]) - Group 4: zero or more chars other than CR, as few as possible, followed by a char other than a backtick; LF is allowed, so blank lines inside the fence are matched
    • \2(?!`) - the same text as captured in Group 2, not followed by a backtick.
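
    The group breakdown above can be checked with a short sketch that runs the same pattern through exec() and returns each capture by name. inspectFence and the property names prefix, fence, lineBreak and body are hypothetical, chosen only for this illustration.

```javascript
// Same pattern as in the answer, without the g flag so that exec() always
// starts from the beginning of the input.
var fenceRe = /((?:^|[^\\])(?:\\{2})*)(`{3,})(\r?\n)(?!`)([^\r]*?[^`])\2(?!`)/m;

// Hypothetical helper (not part of Markdown.Converter.js): returns the
// captured groups of the first fenced block, or null if nothing matches.
function inspectFence(text) {
    var m = fenceRe.exec(text);
    if (m === null) {
        return null;
    }
    return { prefix: m[1], fence: m[2], lineBreak: m[3], body: m[4] };
}

// A fenced block whose body contains a blank line.
var sample = "```\nCODE\n\n   HERE\n```";
var groups = inspectFence(sample);
console.log(JSON.stringify(groups));

// A fence whose first backtick is escaped with a single backslash.
var escaped = inspectFence("\\```\ncode\n```");
console.log(escaped);
```

    For the first input, body holds the whole block including the blank line, because Group 4 excludes only CR. The second input yields null, since the single backslash in front of the backticks prevents Group 1 and Group 2 from lining up on the opening fence.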