
How to correctly embed another language into my current language definition?


I have a Smalltalk sublime-syntax file (YAML) for Sublime Text 3 and I would like to add highlighting support for embedded C code.

The inline C code (which always starts with ^%\{ and ends with %\}$) appears among the Smalltalk code.

A simple example (not much C but wanted a simple case):

sigABRT
    "return the signal number for SIGABRT - 0 if not supported by OS
     (the numeric value is not the same across unix-systems)"

%{  /* NOCONTEXT */
#ifdef SIGABRT
    RETURN ( __mkSmallInteger(SIGABRT) );
#else
    RETURN ( __mkSmallInteger(0) );
#endif
%}
!

Sublime Text has a new embed feature (there is even an example of it).

I tried to do something like this:

- match: '^%\{'
  embed: scope:source.c
  embed_scope: meta.environment.embedded.c.smalltalk source.c.embedded
  escape: '%\}$'

However, I was unable to correctly incorporate it into my current highlighting file.

Does anyone know how to correctly embed one language in another?


Solution

  • This question is a little sticky because you've provided a sample syntax definition and a sample of some Smalltalk source code, but the code provided is not highlighted by the provided syntax because it's not structured properly.

    For our purposes here, let's assume that the Smalltalk sample you provided is the following one. This may or may not be valid (it's been a long time since I've worked with Smalltalk) but it highlights with your syntax, so let's call that good enough for testing purposes.

    Object subclass: Test [
        sigABRT
            "return the signal number for SIGABRT - 0 if not supported by OS
             (the numeric value is not the same across unix-systems)"
    
        %{  /* NOCONTEXT */
        #ifdef SIGABRT
            RETURN ( __mkSmallInteger(SIGABRT) );
        #else
            RETURN ( __mkSmallInteger(0) );
        #endif
        %}
        !
    ].
    

    The syntax match that you provided above is the correct one to use, so I'm guessing that your problem is in where you placed it in the syntax.
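
    For reference, here is that rule again with each key annotated. The comments are my reading of the sublime-syntax documentation rather than anything taken from your syntax file, so treat them as a sketch:

    ```yaml
    # The rule from the question, unchanged apart from the comments.
    - match: '^%\{'            # start of the inline C block: %{ at the start of a line
      embed: scope:source.c    # hand the following text to the syntax whose base scope is source.c
      embed_scope: meta.environment.embedded.c.smalltalk source.c.embedded  # scope(s) applied to the embedded text
      escape: '%\}$'           # leave the embedded syntax when %} ends a line
    ```

    The escape pattern is required whenever embed is used, and it ends the embedded section no matter how deeply the C syntax has nested its own contexts at that point.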

    Let's presume that there is more than one place in the syntax definition where we might want to match one of these C blocks. In that case it makes sense to create a new context that contains the match, so that we can include it wherever it's needed:

    c-block:
      - match: '%\{'
        embed: scope:source.c
        embed_scope: meta.environment.embedded.c.smalltalk source.c.embedded
        escape: '%\}$'
    

    This is essentially the same rule as you provided above (without the leading ^ anchor, because the block is indented in the sample above), but placed into a context. Let's say that the first place where a block such as this can appear is in the body of a block. You have a block-body context in your syntax, so we add an include to the end of it to pull in this new context:

    block-body:
      - include: pragma
      - include: selector
      - include: literal
      - include: block
      - include: comment
      - include: c-block
    

    However, this does not have the desired outcome; the highlighting is not correct:

    [Screenshot: incorrect highlighting]

    Clearly the highlighting goes wrong at the start of the C comment at the latest, and possibly earlier. If you use Tools > Developer > Show Scope Name while the cursor is on the comment, you can see that the scope assigned is source.smalltalk entity.name.function, which means that the syntax is treating the start of the C comment as a method name.

    It also looks like the %{ construct is not properly highlighted, and a check shows that the scope of the % character is source.smalltalk keyword.other.

    So the real problem is that, with the above definitions in place, the % in %{ is being matched as a keyword instead of as the start of a C block, and once it has been consumed as a keyword the rule for matching a C block never gets a chance to trigger.

    If you look at your syntax, the main context looks like this:

    main:
      - match: '([a-zA-Z][a-zA-Z0-9]*)\s*(subclass:)\s*([a-zA-Z][a-zA-Z0-9]*)\s*\['
        captures:
          1: entity.other.inherited-class
          2: keyword.other
          3: entity.name.type
        push:
          - match: '\]'
            pop: true
          - include: pragma
          - match: '(([a-zA-Z][a-zA-Z0-9]*:)|[+\-\/\\*~<>=@%|&?!.,:;^]+)\s*([a-zA-Z][a-zA-Z0-9]*)'
            captures:
              1: entity.name.function
              3: variable.other
          - match: "([a-zA-Z][a-zA-Z0-9]*)"
            scope: entity.name.function
          - include: block
          - include: comment
          - include: block-body
    

    These rules say that when we see a line that starts with something like BaseClass subclass: SubClass [, we are entering into an anonymous context (via the push) to handle the contents of the class body (or block or whatever).

    The anonymous context contains the rule to pop out when it sees the closing ] character, an include of the pragma context, two different matches to find a function name, and then includes of the block, comment and block-body contexts, in that order.

    When you include a context, Sublime takes all of the match rules from that context and inserts a copy of them at the point of the include, as if you had manually entered them there.
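
    As a toy illustration of that include behaviour (the contexts numbers and words below are hypothetical and have nothing to do with your Smalltalk syntax), these two definitions of main should behave the same way:

    ```yaml
    # Hypothetical contexts, for illustration only.
    contexts:
      main:
        - include: numbers          # rules from "numbers" are tried first
        - include: words            # then rules from "words"
      numbers:
        - match: '\b[0-9]+\b'
          scope: constant.numeric
      words:
        - match: '\b[a-z]+\b'
          scope: variable.other

    # Equivalent to writing main out by hand, in the same order:
    #
    #   main:
    #     - match: '\b[0-9]+\b'
    #       scope: constant.numeric
    #     - match: '\b[a-z]+\b'
    #       scope: variable.other
    ```

    The point to take away is that the position of an include line decides where the included rules sit relative to the other rules in the context.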

    In addition, when more than one rule in a context could match at the same position in the text, the rule that appears first in the context is the one that is applied (i.e. it "wins" the tie).
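
    A minimal sketch of that tie-breaking, using a made-up standalone syntax (the name and scope source.order-demo are hypothetical, and the punctuation scope is just a placeholder):

    ```yaml
    %YAML 1.2
    ---
    # Hypothetical toy syntax, not part of the Smalltalk definition.
    name: Order Demo
    scope: source.order-demo
    contexts:
      main:
        # Both rules can match at the % in "%{". This one is listed first,
        # so it wins and consumes the % on its own.
        - match: '%'
          scope: keyword.other
        # This rule never fires on "%{", because the % is already taken
        # by the time the { is reached.
        - match: '%\{'
          scope: punctuation.section.embedded.begin
    ```

    Swapping the order of the two rules lets the %\{ rule match first, while a lone % still falls through to the keyword rule.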

    The scope keyword.other is applied by rules in the pragma context as well as the selector context, and the selector context can match a single % character as a keyword.

    The problem, then, is that the include of c-block appears after selector in the include list of the block-body context, so the selector rules find and match the % character before the rule for the C block can.

    The solution is to move the include of c-block to the top of the list, ahead of pragma and selector, so that it gets the first chance to match:

    block-body:
      - include: c-block
      - include: pragma
      - include: selector
      - include: literal
      - include: block
      - include: comment
    

    With that in place, the block highlights more like we would expect:

    [Screenshot: correct highlighting]
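
    If you also want the %{ and %} delimiters themselves to carry a scope of their own, one possible refinement is sketched below. It is not part of your original syntax: the two punctuation scope names are assumptions on my part, while scope and escape_captures are standard sublime-syntax keys.

    ```yaml
    c-block:
      - match: '%\{'
        # Assumed scope name for the opening delimiter.
        scope: punctuation.section.embedded.begin.smalltalk
        embed: scope:source.c
        embed_scope: meta.environment.embedded.c.smalltalk source.c.embedded
        escape: '%\}$'
        escape_captures:
          # Capture group 0 is the whole escape match, i.e. the closing %}.
          0: punctuation.section.embedded.end.smalltalk
    ```

    Whether those scopes are actually coloured depends on the colour scheme in use, so check the result with Show Scope Name rather than by eye.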