
PyCharm README.md doesn't escape * with \


Put this in a README.md file:

In A\*68sff

The PyCharm preview shows the backslash (proof):

In A\*68sff

It should be:

In A*68sff

On GitHub the preview is correct (proof):

In A*68sff

I'm using:

PyCharm 2022.3.1 (Professional Edition)
Build #PY-223.8214.51, built on December 20, 2022
Licensed to **********************
Subscription is active until May 13, 2023.
For educational use only.
Runtime version: 17.0.5+1-b653.23 amd64
VM: OpenJDK 64-Bit Server VM by JetBrains s.r.o.
Windows 11 10.0
GC: G1 Young Generation, G1 Old Generation
Memory: 2030M
Cores: 16
Non-Bundled Plugins:
    com.chesterccw.excelreader (2022.12.1-203.223)
    com.github.copilot (1.1.38.2229)
    me.lensvol.blackconnect (0.5.0)

Solution

  • When you say:

    It should be:

    In A*68sff
    

    In fact it shouldn't, or at least not necessarily. This is what's called an "ambiguity" in the Markdown specification. Let's look carefully at the original Markdown spec:

    Emphasis

    Markdown treats asterisks (*) and underscores (_) as indicators of emphasis. Text wrapped with one * or _ will be wrapped with an (...)

    And that's the "ambiguity" in your example: A*68sff is not wrapped in asterisks, it contains only a single asterisk. The original Markdown specification does not say how an escaped asterisk should be treated when it is not part of a wrapping pair.

    What's happening is that GitHub and PyCharm use different implementations (parsers) of the Markdown specification, and those parsers resolve the ambiguities differently. The later CommonMark specification starts by summing this up:

    Why is CommonMark needed?

    John Gruber’s canonical description of Markdown’s syntax does not specify the syntax unambiguously.

    (...)

    Because there is no unambiguous spec, implementations have diverged considerably over the last 10 years. As a result, users are often surprised to find that a document that renders one way on one system (say, a GitHub wiki) renders differently on another (say, converting to docbook using Pandoc). To make matters worse, because nothing in Markdown counts as a “syntax error,” the divergence often isn’t discovered right away.


    1. If you want to escape wrapped emphasis characters like in this example

      In A*68sff, and A*sds
      

      escape both of them:

      In A\*68sff, and A\*sds
      
    2. If you want to escape a single emphasis character to obtain:

      In A*68sff, and Asds
      

      because it is ambiguous, you should rely on Automatic Escaping for Special Characters and embed an HTML entity for the asterisk (for example &ast; or &#42;) directly into the Markdown source, like so:

      In A&ast;68sff, and Asds
      

    Embedding HTML entities is widely considered to have the drawback of making the Markdown source uglier and harder to read. GitHub supports both Markdown and reStructuredText, so if you need to write more complex readme files, consider using reStructuredText. With it, the problem in this question could be solved by simply escaping the isolated emphasis character (the second case above) without needing to embed HTML entities.
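
The two escaping approaches from the numbered list above can be sketched in a few lines of Python. This is only an illustration using the standard library: the helper names `backslash_escape` and `entity_escape_first` are hypothetical, and `html.unescape` only shows what the entity decodes to in HTML. It does not verify how the PyCharm preview or GitHub treat either form.

```python
import html
import re


def backslash_escape(text: str) -> str:
    """Case 1: prefix every * and _ with a backslash."""
    return re.sub(r"([*_])", r"\\\1", text)


def entity_escape_first(text: str) -> str:
    """Case 2: replace only the first * with the HTML entity &ast;."""
    return text.replace("*", "&ast;", 1)


source = "In A*68sff"

print(backslash_escape(source))      # In A\*68sff
print(entity_escape_first(source))   # In A&ast;68sff

# An HTML consumer decodes the entity back to a literal asterisk.
print(html.unescape(entity_escape_first(source)))  # In A*68sff
```

The backslash form depends on how each Markdown parser handles an escaped, unpaired asterisk, while the entity form sidesteps the emphasis rules because the source no longer contains a literal asterisk at that position.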