Tags: regex, syntax-highlighting, text-editor, textmate2

Use alternate syntax highlighting in middle of TextMate2 comment


By the very nature of a comment, this might not make sense.

On the other hand, what I'm trying to achieve is not too different from an escape character.

As a simple example, I want # comment :break: comment to show up more like

#comment 
"break" 
# comment

would, except without the second #, with everything on one line, and with some other delimiter in place of the quotes. Like quotes (and unlike the escape characters I'm familiar with, e.g., \), the delimiter would explicitly mark both the beginning and the end of the interruption to the comment.

Thanks to @Graham P Heath, I was able to achieve alternate forms of comments in a previous question. What I'm after is an enhancement to what was achieved there. In my scenario, # starts a comment in the language I'm using (R), and #' functions both as an R comment and as the start of code in another language. I can already get everything after the #' to take on syntax highlighting that differs from the typical R comment. Now I'm trying to get a very modest amount of syntax highlighting within this sub-language: #' actually indicates the start of markdown, and I want "raw" syntax highlighting for text surrounded by a pair of backticks (`).

The piece of the language grammar that I'm trying to interrupt is as follows:

{   begin = '(^[ \t]+)?(?=#'' )';
            end = '(?!\G)';
            beginCaptures = { 1 = { name = 'punctuation.whitespace.comment.leading.r'; }; };
            patterns = (
                {   name = 'comment.line.number-sign-tick.r';
                    begin = "#' ";
                    end = '\n';
                    beginCaptures = { 0 = { name = 'punctuation.definition.comment.r'; }; };
                },
            );
        },

Solution

  • I'm pretty sure I've figured it out. What I didn't understand previously was how the scoping worked. I still don't understand it fully, but I now know enough to create nested definitions (regex) for the begin and end of each type of syntax.

    The scoping makes things so much easier! Previously I wanted a regex like (?<=\A#'\s.*)(\$) to find a dollar sign within the #'-style comment, but that can't work: a look-behind must have a fixed width, so the * repetition inside it is rejected (+ fails for the same reason). With scoping, a nested rule is only tried inside its parent's match, so it's already implied that we are inside the \A#'\s match before \$ is tested.
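The contrast between the single look-behind regex and the scoped approach can be sketched outside TextMate. This is only an illustration: Python's `re` module stands in for TextMate's Oniguruma engine (an assumption, though both reject an unbounded quantifier inside a look-behind), and the names `compiles`, `OUTER`, `INNER`, and `dollar_offsets` are hypothetical helpers, not part of any grammar.

```python
import re


def compiles(pattern):
    """Report whether the regex engine accepts `pattern` at all."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# The single-regex attempt from the text: `.*` inside a look-behind.
LOOKBEHIND = r"(?<=\A#'\s.*)(\$)"

# Scoped approach: an outer rule claims the "#' " line, and the inner rule
# is only ever applied to the text inside that match.
OUTER = re.compile(r"^[ \t]*#' (.*)$")
INNER = re.compile(r"\$")


def dollar_offsets(line):
    """Return offsets of `$` in `line`, but only when it is a #' comment."""
    outer = OUTER.match(line)
    if outer is None:
        return []  # not a #' comment, so the inner rule is never tried
    start = outer.start(1)
    return [start + m.start() for m in INNER.finditer(outer.group(1))]
```

Under these assumptions, `compiles(LOOKBEHIND)` reports that the pattern is rejected, while `dollar_offsets("#' cost is $5")` finds the dollar sign and `dollar_offsets('data$blah <- "blah"')` finds nothing, which mirrors how ordinary R code keeps its normal `$` behavior.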

    Here is the relevant portion of my Language Grammar:

    {   begin = '(^[ \t]+)?(?=#\'' )';
                end = '(?!\G)';
                beginCaptures = { 1 = { name = 'punctuation.whitespace.comment.leading.r'; }; };
                patterns = (
    
                    {   name = 'comment.line.number-sign-tick.r';
                        begin = "#' ";
                        end = '\n';
                        beginCaptures = { 0 = { name = 'punctuation.definition.comment.r'; }; };
    
    
                        patterns = (
    
                            // Markdown within Comment
                            {   name = 'comment.line.number-sign-tick-raw.r';
                                begin = '(`)(?!\s)'; // backtick not followed by whitespace
                                end = '(?<!\s)(`)'; // backtick not preceded by whitespace
                                beginCaptures = { 0 = { name = 'punctuation.definition.comment.r'; }; };
                            },
    
                            // Equation within comment
                            {   name = 'comment.line.number-sign-tick-eqn.r';
                                begin = '((?<!\G)([\$]{1,2})(?!\s))';
                                end = '(?<!\s)([\$]{1,2})';
                                beginCaptures = { 0 = { name = 'punctuation.definition.comment.r'; }; };
    
                                // Markdown within Equation
                                patterns = (
                                    {   name = 'comment.line.number-sign-tick-raw.r';
                                        begin = '(`)(?!\s)'; // backtick not followed by whitespace
                                        end = '(?<!\s)(`)'; // backtick not preceded by whitespace
                                        beginCaptures = { 0 = { name = 'punctuation.definition.comment.r'; }; };
                                    },
                                );
                            },
                        );
                    },
    
                );
            },
    

    Here is some R code to test it on:

    # below is a `knitr` (note no effect of backticks) code chunk
    #+ codeChunk, include=FALSE
    
    
    # normal R comment, followed by code
    data <- matrix(rnorm(6,3, sd=7), nrow=2)
    
    #' This would be recognized as markdown by `knitr::spin()`, with the preceding portion as "raw" text
    `note that this doesnt go to the 'raw' format ... it is normal code!`
    
    #+ anotherChunk
    # also note how the dollar signs behave normally
    data <- as.list(data)
    data$blah <- "blah"
    `data`[[1]] # backticks behaving
    
    #' I can introduce a Latex-style equation, filling in values from R using `knitr` code chunks: $\frac{top}{bottom}=\frac{`r topValue`}{`r botValue`}$ then continue on with markdown.
    

    And here is what that looks like in TextMate2 after making these changes: [screenshot]

    Pretty good, except that the backticked pieces pick up the italics when they're inside an equation. I can live with that. I can even convince myself that I wanted it that way ;) (By the way, I specified fontName='regular' for the Courier New text, so I don't know why that is being overridden.)
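To sanity-check the nested begin/end patterns from the grammar above without opening TextMate, a rough per-line simulation is possible. The sketch below is an approximation under stated assumptions, not TextMate's actual matcher: it uses Python's `re` instead of Oniguruma, drops the capture groups, omits the `(?<!\G)` guard (Python has no `\G`), and treats each line in isolation. The rule names `raw` and `eqn` and the helpers `scan` and `comment_spans` are hypothetical.

```python
import re

# Begin/end pairs adapted from the grammar; (?<!\G) is omitted (assumption).
RAW = {"name": "raw", "begin": re.compile(r"`(?!\s)"),
       "end": re.compile(r"(?<!\s)`"), "patterns": []}
EQN = {"name": "eqn", "begin": re.compile(r"\${1,2}(?!\s)"),
       "end": re.compile(r"(?<!\s)\${1,2}"), "patterns": [RAW]}


def scan(text, pos, rules, end=None, path=""):
    """Return (spans, next_pos) for `text` from `pos`, honoring nesting."""
    spans = []
    while pos < len(text):
        # The enclosing rule's end competes with the child begins; rank 0
        # lets the end win a tie at the same offset.
        found = []
        if end is not None:
            m = end.search(text, pos)
            if m:
                found.append((m.start(), 0, m, None))
        for rank, rule in enumerate(rules, start=1):
            m = rule["begin"].search(text, pos)
            if m:
                found.append((m.start(), rank, m, rule))
        if not found:
            break
        _, _, m, rule = min(found, key=lambda f: f[:2])
        if rule is None:
            return spans, m.end()  # the enclosing rule closed here
        name = path + rule["name"]
        inner, pos = scan(text, m.end(), rule["patterns"], rule["end"],
                          name + "/")
        spans.append((name, text[m.start():pos]))
        spans.extend(inner)
    return spans, len(text)  # an unclosed rule runs to the end of the line


def comment_spans(line):
    """Scope a line only if it is a #' comment, as the outer rule does."""
    if not line.lstrip(" \t").startswith("#' "):
        return []
    return scan(line, line.index("#' ") + 3, [RAW, EQN])[0]
```

For a made-up line such as `#' see `knitr` then $\frac{`r top`}{2}$ done`, this sketch reports a `raw` span for `` `knitr` ``, an `eqn` span for the whole equation, and an `eqn/raw` span for `` `r top` `` nested inside it. That nesting is consistent with the backticked pieces inheriting the equation's styling in the screenshot.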