
How to create a new QsciLexer instance without subclassing QsciLexerCustom?

I've discovered these anonymous enumerators:


I want to set the above as the current lexer of the Scintilla editor instance.

For implemented lexers I know I can do that with self.editor.setLexer(QsciLexerPython(self.editor)), etc., for everything from asm, avs and bash to vhdl, xml and yaml.

But there are also unimplemented ones in the enum, like SCLEX_FORTH, SCLEX_ERLANG, SCLEX_GUI4CLI, etc.

I've tried this:

self.editor.SendScintilla(self.editor.SCI_SETLEXER, self.editor.SCLEX_ERLANG)

But it doesn't work. The Python interpreter runs it without errors, but there is no highlighting and no folding.

So basically my question is how to set the lexer using those enumerators without subclassing QsciLexerCustom and reimplementing everything: you know, don't repeat yourself and don't reinvent the wheel.

Any help will be appreciated!

By the way, I'm using Python 3.12 and PyQt5 QScintilla 2.14.1 on my Windows 11 PC.


  • TL;DR: Yes, you can.

    You do not need to subclass QsciLexerCustom and define all the lexical rules from scratch. What you are looking for instead is a bare subclass of QsciLexer itself.

    Without further introduction, let me show you a complete reproducible example first (sorry that it is not a minimal one):

    (From the code snippet you provided I believe you are using Python, so I wrote the code in Python, but the structure is the same in C++):

    # This defines the interface to the abstract QsciLexerAsm class.
    from PyQt5.QtCore import *
    from PyQt5.QtWidgets import *
    from PyQt5.QtGui import *
    from PyQt5.Qsci import *
    import platform
    import sys
    # For testing purposes only.
    if "QsciLexerAsm" in dir():
        del QsciLexerAsm
    class QsciLexerAsm(QsciLexer):
        propertyChanged = pyqtSignal(str, str)
        Default: int = 0
        Comment: int = 1
        Number: int = 2
        DoubleQuotedString: int = 3
        Operator: int = 4
        Identifier: int = 5
        CPUInstruction: int = 6
        FPUInstruction: int = 7
        Register: int = 8
        Directive: int = 9
        DirectiveOperand: int = 11
        BlockComment: int = 12
        SingleQuotedString: int = 13
        UnclosedString: int = 14
        ExtendedInstruction: int = 16
        CommentDirective: int = 17
        __fold_comments: bool
        __fold_compact: bool
        __comment_delimiter: str
        __fold_syntax_based: bool
        def __init__(self, parent: QObject=None) -> None:
            super(QsciLexerAsm, self).__init__(parent)
            self.__fold_comments = True
            self.__fold_compact = True
            self.__comment_delimiter = '~'
            self.__fold_syntax_based = True
        def __del__(self) -> None:
            del self
        def language(self) -> str:
            return "ASM"
        def lexer(self) -> str:
            return "asm"
        def defaultColor(self, style: int) -> QColor:
            match style:
                case QsciLexerAsm.Comment | QsciLexerAsm.BlockComment:
                    return QColor(0x00, 0x7f, 0x00)
                case QsciLexerAsm.Number:
                    return QColor(0x00, 0x7f, 0x7f)
                case QsciLexerAsm.DoubleQuotedString | QsciLexerAsm.SingleQuotedString:
                    return QColor(0x7f, 0x00, 0x7f)
                case QsciLexerAsm.Operator | QsciLexerAsm.UnclosedString:
                    return QColor(0x00, 0x00, 0x00)
                case QsciLexerAsm.CPUInstruction:
                    return QColor(0x00, 0x00, 0x7f)
                case QsciLexerAsm.FPUInstruction | QsciLexerAsm.Directive | QsciLexerAsm.DirectiveOperand:
                    return QColor(0x00, 0x00, 0xff)
                case QsciLexerAsm.Register:
                    return QColor(0x46, 0xaa, 0x03)
                case QsciLexerAsm.ExtendedInstruction:
                    return QColor(0xb0, 0x00, 0x40)
                case QsciLexerAsm.CommentDirective:
                    return QColor(0x66, 0xaa, 0x00)
            return super(QsciLexerAsm, self).defaultColor(style)
        def defaultEolFill(self, style: int) -> bool:
            if style == QsciLexerAsm.UnclosedString:
                return True
            return super(QsciLexerAsm, self).defaultEolFill(style)
        def defaultFont(self, style: int) -> QFont:
            f: QFont = QFont()
            match style:
                case QsciLexerAsm.Operator | QsciLexerAsm.CPUInstruction | QsciLexerAsm.Register:
                    f = super(QsciLexerAsm, self).defaultFont(style)
                case QsciLexerAsm.Comment | QsciLexerAsm.BlockComment:
                    if platform.system() == "Windows":
                        f = QFont("Comic Sans MS", 9)
                    elif platform.system() == "Darwin":
                        f = QFont("Comic Sans MS", 12)
                    else:
                        f = QFont("Bitstream Vera Serif", 9)
                case _:
                    f = super(QsciLexerAsm, self).defaultFont(style)
            return f
        def defaultPaper(self, style: int) -> QColor:
            if style == QsciLexerAsm.UnclosedString:
                return QColor(0xe0, 0xc0, 0xe0)
            return super(QsciLexerAsm, self).defaultPaper(style)
        def keywords(self, set_: int) -> str:
            # Each keyword set is a single string of space-separated words.
            if set_ == 1:
                return (
                    ""  # <- CPU instructions, replace this by copying official docs
                )
            if set_ == 2:
                return (
                    ""  # <- FPU instructions, replace this by copying official docs
                )
            if set_ == 3:
                return (
                    ""  # <- Register names, replace this by copying official docs
                )
            if set_ == 4:
                return (
                    ""  # <- Directives, replace this by copying official docs
                )
            if set_ == 5:
                return (
                    ""  # <- Directive Operands, replace this by copying official docs
                )
            if set_ == 6:
                return (
                    ""  # <- Extended Instructions, replace this by copying official docs
                )
            return ""
        def description(self, style: int) -> str:
            # An empty string means the style number is not used by this lexer.
            match style:
                case QsciLexerAsm.Default:
                    return self.tr("Default")
                case QsciLexerAsm.Comment:
                    return self.tr("Comment")
                case QsciLexerAsm.Number:
                    return self.tr("Number")
                case QsciLexerAsm.DoubleQuotedString:
                    return self.tr("Double-quoted string")
                case QsciLexerAsm.Operator:
                    return self.tr("Operator")
                case QsciLexerAsm.Identifier:
                    return self.tr("Identifier")
                case QsciLexerAsm.CPUInstruction:
                    return self.tr("CPU instruction")
                case QsciLexerAsm.FPUInstruction:
                    return self.tr("FPU instruction")
                case QsciLexerAsm.Register:
                    return self.tr("Register")
                case QsciLexerAsm.Directive:
                    return self.tr("Directive")
                case QsciLexerAsm.DirectiveOperand:
                    return self.tr("Directive operand")
                case QsciLexerAsm.BlockComment:
                    return self.tr("Block comment")
                case QsciLexerAsm.SingleQuotedString:
                    return self.tr("Single-quoted string")
                case QsciLexerAsm.UnclosedString:
                    return self.tr("Unclosed string")
                case QsciLexerAsm.ExtendedInstruction:
                    return self.tr("Extended instruction")
                case QsciLexerAsm.CommentDirective:
                    return self.tr("Comment directive")
            return ""
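        def refreshProperties(self) -> None:
            # Push every property to the editor again.
            self.__setCommentProp()
            self.__setCompactProp()
            self.__setCommentDelimiterProp()
            self.__setSyntaxBasedProp()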
        def foldComments(self) -> bool:
            return self.__fold_comments
        def foldCompact(self) -> bool:
            return self.__fold_compact
        def commentDelimiter(self) -> str:
            return self.__comment_delimiter
        def foldSyntaxBased(self) -> bool:
            return self.__fold_syntax_based
        def setFoldComments(self, fold: bool) -> None:
            self.__fold_comments = fold
            self.__setCommentProp()  # notify the editor of the new value
        def setFoldCompact(self, fold: bool) -> None:
            self.__fold_compact = fold
            self.__setCompactProp()
        def setCommentDelimiter(self, delimiter: str) -> None:
            self.__comment_delimiter = delimiter
            self.__setCommentDelimiterProp()
        def setFoldSyntaxBased(self, syntax_based: bool) -> None:
            self.__fold_syntax_based = syntax_based
            self.__setSyntaxBasedProp()
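        def readProperties(self, qs: QSettings, prefix: str) -> bool:
            # type=bool lets QSettings convert stored "true"/"false" strings;
            # a plain bool("false") would evaluate to True.
            self.__fold_comments = qs.value(prefix + "foldcomments", True, type=bool)
            self.__fold_compact = qs.value(prefix + "foldcompact", True, type=bool)
            # Default delimiter matches the one set in __init__.
            self.__comment_delimiter = str(qs.value(prefix + "commentdelimiter", "~"))
            self.__fold_syntax_based = qs.value(prefix + "foldsyntaxbased", True, type=bool)
            return True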
        def writeProperties(self, qs: QSettings, prefix: str) -> bool:
            qs.setValue(prefix + "foldcomments", self.__fold_comments)
            qs.setValue(prefix + "foldcompact", self.__fold_compact)
            qs.setValue(prefix + "commentdelimiter", self.__comment_delimiter)
            qs.setValue(prefix + "foldsyntaxbased", self.__fold_syntax_based)
            return True
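        # Property names are the ones defined by Scintilla's Asm lexer.
        def __setCommentProp(self) -> None:
            self.propertyChanged.emit("fold.asm.comment.multiline",
                    ("1" if self.__fold_comments else "0"))
        def __setCompactProp(self) -> None:
            self.propertyChanged.emit("fold.compact", ("1" if self.__fold_compact else "0"))
        def __setCommentDelimiterProp(self) -> None:
            self.propertyChanged.emit("lexer.asm.comment.delimiter", self.__comment_delimiter)
        def __setSyntaxBasedProp(self) -> None:
            self.propertyChanged.emit("fold.asm.syntax.based",
                    ("1" if self.__fold_syntax_based else "0"))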
    # Example
    if __name__ == "__main__":
        app: QApplication = QApplication(sys.argv)
        w: QsciScintilla = QsciScintilla()
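        lexer: QsciLexerAsm = QsciLexerAsm(w)
        w.setLexer(lexer)  # the editor calls lexer.lexer() to pick the Scintilla lexer
        w.show()
        sys.exit(app.exec_())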

    The only workaround lexer I have written this way is the Asm one. Since you're using QScintilla 2.14, which already ships QsciLexerAsm, it is not the best example for you, but the logic is the same for any other lexer.

    The result should look like this (after you have filled in all the keywords and built-in identifiers, of course):

    [Image: ASM lexer result]

    Running on Windows 11 with Python 3.12, as you mentioned. Looks identical to the official lexer, doesn't it?


    Let's take a closer look. The magic that enables and empowers this workaround method lies here:

        def lexer(self) -> str:
            return "asm"

    The editor reads this lexer identifier and selects the matching internal lexer; as far as I know, it's the only way to adopt a built-in lexer on an editor through setLexer(). I guess that's why the code you tried doesn't work, as those enumerators are meant for internal use. That's also why you must reimplement this method after inheriting from the abstract QsciLexer class.
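    To make the idea concrete for one of the lexers you asked about, here is a minimal sketch for Erlang. Treat it as an illustration, not as tested production code: the class name ErlangLexer, the helper make_erlang_lexer() and the colours are my own choices, and the style numbers are assumed to match the SCE_ERLANG_* values in SciLexer.h, so check them against your copy of the header.

```python
# Sketch only: style numbers are assumed to match the SCE_ERLANG_* values
# in SciLexer.h; verify them against your copy of the header.
ERLANG_STYLES = {
    0: ("Default", "#000000"),
    1: ("Comment", "#007f00"),
    3: ("Number", "#007f7f"),
    4: ("Keyword", "#00007f"),
    5: ("String", "#7f007f"),
}


def make_erlang_lexer(parent=None):
    """Return a bare QsciLexer subclass instance that selects the "erlang" lexer."""
    # Imported lazily so the style table above can be inspected without PyQt5.
    from PyQt5.QtGui import QColor
    from PyQt5.Qsci import QsciLexer

    class ErlangLexer(QsciLexer):  # hypothetical name, not part of QScintilla
        def language(self):
            return "Erlang"

        def lexer(self):
            # The identifier of the built-in Scintilla lexer to use.
            return "erlang"

        def description(self, style):
            # An empty string marks the style number as unused.
            return ERLANG_STYLES.get(style, ("", ""))[0]

        def defaultColor(self, style):
            if style in ERLANG_STYLES:
                return QColor(ERLANG_STYLES[style][1])
            return super().defaultColor(style)

        def keywords(self, set_):
            # Set 1: a few Erlang reserved words (extend as needed).
            if set_ == 1:
                return "case end fun if of receive when"
            return ""

    return ErlangLexer(parent)
```

    With an editor at hand, usage would be along the lines of self.editor.setLexer(make_erlang_lexer(self.editor)).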

    You might ask: what are the identifiers for those lexers, for example Baan, Gui4Cli, etc.?

    They are defined in the /scintilla/include/SciLexer.h header. Since it's a pure C++ file, you may not find it in the Python package, so you need to look it up in the QScintilla source.

    As you can see, there are enumerators, just like the ones you mentioned, from line 17 (SCLEX_CONTAINER, SCLEX_NULL, SCLEX_PYTHON, SCLEX_CPP, SCLEX_HTML, etc.) to SCLEX_AUTOMATIC on line 142. The enumerator name with the SCLEX_ prefix removed and lower-cased (SCLEX_ERLANG becomes "erlang") is normally the identifier of the lexer, and that is what the overridden QsciLexer.lexer() -> str (or const char *QsciLexer::lexer() in C++) must return. A few lexers are registered under a different name (the HTML lexer, for example, is registered as "hypertext"), so if nothing gets highlighted, check the name in the lexer's own source file.
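    If you don't want to read the header by eye, a few lines of Python can list the candidate identifiers for you. This is only a sketch: candidate_lexer_names() is a helper name I made up, and HEADER_SAMPLE is a handful of lines in the shape the header uses, not the full file. Treat the results as candidates, since a lexer may be registered under a name that differs from its enumerator.

```python
import re

# A few lines in the shape used by SciLexer.h (sample text, not the full header).
HEADER_SAMPLE = """\
#define SCLEX_CONTAINER 0
#define SCLEX_NULL 1
#define SCLEX_PYTHON 2
#define SCLEX_CPP 3
#define SCLEX_AUTOMATIC 1000
"""

_SCLEX = re.compile(r"^#define\s+SCLEX_(\w+)\s+(\d+)\s*$", re.MULTILINE)


def candidate_lexer_names(header_text):
    """Map each SCLEX_ number to the lower-cased enumerator name."""
    return {int(number): name.lower() for name, number in _SCLEX.findall(header_text)}


if __name__ == "__main__":
    for number, name in sorted(candidate_lexer_names(HEADER_SAMPLE).items()):
        print(number, name)
```

    To run it on the real header, read the file with open(...).read() and pass the text in.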

    Token types

    These are the style constants, for example Default, Comment and UnclosedString in QsciLexerCPP, or CPUInstruction in QsciLexerAsm. You can also find them in SciLexer.h, from line 143 onward, as the macros (the identifiers after #define) that start with SCE_. The part that follows SCE_ is usually the upper-cased lexer identifier returned by lexer(), but some lexers use abbreviated prefixes:

    • Legend: SCE_ prefix = "lexer identifier"
    • P = "python"
    • C = "cpp" or "cppnocase"
    • H = "hypertext" (HTML)
    • HJ = "hypertext"'s JavaScript section
    • HJA = "hypertext"'s ASP JavaScript section
    • HB = "hypertext"'s VB Script section
    • HBA = "hypertext"'s ASP VB Script section
    • HP = "hypertext"'s Python section
    • HPA = "hypertext"'s ASP Python section
    • HPHP = "hypertext"'s PHP section
    • PL = "perl"
    • RB = "ruby"
    • B = the BASIC family ("vb", "vbscript", "freebasic", "purebasic", "blitzbasic")
    • PROPS = "props" (properties files)
    • L = "latex"
    • ERR = "errorlist"
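    The same trick works for the style numbers: group the SCE_ macros by their prefix and you get one table per lexer family, which you can then copy into your subclass as class attributes. Again just a sketch under assumptions: styles_by_prefix() is a made-up helper and SCE_SAMPLE is a short excerpt in the header's format, not the whole file.

```python
import re
from collections import defaultdict

# Sample lines in the shape used by SciLexer.h (not the full header).
SCE_SAMPLE = """\
#define SCE_P_DEFAULT 0
#define SCE_P_COMMENTLINE 1
#define SCE_P_NUMBER 2
#define SCE_C_DEFAULT 0
#define SCE_C_COMMENT 1
"""

# The prefix is the first token after SCE_; the rest is the style name.
_SCE = re.compile(r"^#define\s+SCE_([A-Za-z0-9]+)_(\w+)\s+(\d+)\s*$", re.MULTILINE)


def styles_by_prefix(header_text):
    """Group SCE_<PREFIX>_<NAME> macros into {prefix: {name: number}}."""
    tables = defaultdict(dict)
    for prefix, name, number in _SCE.findall(header_text):
        tables[prefix][name] = int(number)
    return dict(tables)


if __name__ == "__main__":
    for prefix, styles in styles_by_prefix(SCE_SAMPLE).items():
        print(prefix, styles)
```

    For a lexer without an abbreviated prefix you would look up the upper-cased identifier, e.g. the "ERLANG" table for "erlang".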


    Properties read/write

    For example, setSmartHighlighting(bool) in QsciLexerPascal.

    You can do this by creating three methods at a time: set...(), ...() and the private __set...Prop(), as well as by overriding the readProperties() and writeProperties() methods, as shown in the code above.

    The property string used by __set...Prop() (e.g. "fold.asm.syntax.based" in __setSyntaxBasedProp()) can be found in each lexer's source file. For example, in a hypothetical QsciLexerBasic you could have:

        def setBasicExplicitComment(self, explicit: bool) -> None:
            self.__basic_explicit_comment = explicit
        def basicExplicitComment(self) -> bool:
            return self.__basic_explicit_comment
        def __setExplicitCommentProp(self) -> None:
            self.propertyChanged.emit("fold.basic.comment.explicit", ("1" if self.__basic_explicit_comment else "0"))

    Source: here

    Remember, you need to define the propertyChanged = pyqtSignal(str, str) signal as a class attribute (don't define it in __init__!), and a bool value in native Scintilla is the string "1" for True and "0" for False, not a Python bool or the integers 1 and 0!

    If there is anything you don't understand, don't hesitate to leave a comment below.
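    A small helper keeps that string conversion in one place so you can't accidentally emit an int. This is only a sketch: scintilla_property_value() and property_updates() are names I made up, and the property names in the usage lines are just the ones that already appear in this answer.

```python
def scintilla_property_value(value):
    """Convert a Python value to the string form Scintilla properties expect."""
    # bool must be checked explicitly: Scintilla wants "1"/"0", not "True"/"False".
    if isinstance(value, bool):
        return "1" if value else "0"
    return str(value)


def property_updates(settings):
    """Turn {property name: Python value} into (name, string value) pairs."""
    return [(name, scintilla_property_value(value)) for name, value in settings.items()]


if __name__ == "__main__":
    for name, value in property_updates(
        {"fold.compact": True, "fold.asm.syntax.based": False}
    ):
        # Inside a lexer you would call self.propertyChanged.emit(name, value) here.
        print(name, repr(value))
```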