The Ruby syntax highlighting is not working properly when using regexes.
It looks like multiple issues are happening here.
The # messes up the whole syntax highlighting from that point on, on that line.
The " and ' on the string_literal line mess up the highlighting from that point on until the end of the file, which is much more serious.
class Tokenizer
  def initialize(expression)
    @expression = expression
  end
  TOKEN_REGEX = /
    (?<whitespace>\s+) |
    (?<parenthesis>[\(\)]) |
    (?<comparison_operator>#{ComparisonNode::OPERATORS.map { |op| Regexp.escape(op) }.join('|')}) |
    (?<logical_operator>\b(?:#{LogicalNode::OPERATORS.join('|')})\b) |
    (?<boolean_literal>\b(?:#{ValueNode::BOOLEAN_LITERALS.join('|')})\b) |
    (?<number_literal>\d+) |
    (?<string_literal>"[^"]*"|'[^']*') |
    (?<identifier>[a-z_][a-z0-9_\.]*) |
    (?<unknown>.)
  /ix.freeze
  def tokenize
    tokens = []
    @expression.scan(TOKEN_REGEX) do
      match_data = Regexp.last_match
      if match_data[:whitespace]
        next
      elsif match_data[:parenthesis]
        tokens << Token.new(:parenthesis, match_data[0])
      elsif match_data[:comparison_operator]
        tokens << Token.new(ComparisonNode::TYPE, match_data[0])
      elsif match_data[:logical_operator]
        tokens << Token.new(LogicalNode::TYPE, match_data[0].upcase)
      elsif match_data[:boolean_literal]
        tokens << Token.new(:literal, match_data[0].downcase)
      elsif match_data[:number_literal]
        tokens << Token.new(:literal, match_data[0])
      elsif match_data[:string_literal]
        value = match_data[0][1...-1] # Remove surrounding quotes
        tokens << Token.new(:literal, value)
      elsif match_data[:identifier]
        tokens << Token.new(FieldNode::TYPE, match_data[0])
      else
        raise "Unexpected character: #{match_data[0]}"
      end
    end
    tokens
  end
end
This happens with the built-in Ruby syntax highlighting in Sublime Text 3 (Version 3.2.2, Build 3211). I also tried installing Ruby-specific syntax highlighting packages that claim to fix this issue, such as Sublime Better Ruby, but without success.
Has anyone else run into the same issue? If so, how did you fix it? Thanks!
Sublime Text's Ruby syntax takes the opinionated view that multi-line Regexps generally use the %r literal syntax.
So using / / only works correctly if the leading and trailing forward slashes are on the same line.
This is shown in the Ruby.sublime-syntax excerpt below. I linked v3211 because that is your stated version, but the same applies to all earlier versions and up through v4108. It appears this was patched in v4109.
try-regex:
  # Generally for multiline regexes, one of the %r forms below will be used,
  # so we bail out if we can't find a second / on the current line
  - match: '\s*(/)(?![*+{}?])(?=.*/)'
    captures:
      1: string.regexp.classic.ruby punctuation.definition.string.ruby
    push:
      - meta_content_scope: string.regexp.classic.ruby
      - match: "(/)([eimnosux]*)"
        scope: string.regexp.classic.ruby
        captures:
          1: punctuation.definition.string.ruby
          2: keyword.other.ruby
        pop: true
      - include: regex-sub
  - match: ''
    pop: true
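To see why that rule bails out on a line such as TOKEN_REGEX = /, the opening match pattern from the excerpt can be tried directly in Ruby. This is only a rough sketch: it assumes Ruby's regex engine treats the pattern the same way Sublime Text's engine does, and the names TRY_REGEX_OPENER and opens_classic_regex? are made up for illustration.

```ruby
# The `match` pattern from the try-regex context, transcribed as a Ruby Regexp.
# Assumption: Ruby's engine behaves like Sublime Text's for this pattern.
TRY_REGEX_OPENER = /\s*(\/)(?![*+{}?])(?=.*\/)/

# Hypothetical helper: true when the line would be scoped as a classic /.../ regex.
def opens_classic_regex?(line)
  TRY_REGEX_OPENER.match?(line)
end

# A second slash exists later on the same line, so the lookahead succeeds.
puts opens_classic_regex?('WORD = /[a-z]+/i')  # → true

# No second slash on this line, so the rule bails out and the following
# lines are highlighted as ordinary Ruby code instead of regex content.
puts opens_classic_regex?('TOKEN_REGEX = /')   # → false
```

Once the opener is rejected, the body of the regex is parsed as plain Ruby, which would explain why # and the quote characters get treated as a comment and as string delimiters.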
Knowing this you can alter your code to:
TOKEN_REGEX = %r{
  (?<whitespace>\s+) |
  (?<parenthesis>[\(\)]) |
  (?<comparison_operator>#{ComparisonNode::OPERATORS.map { |op| Regexp.escape(op) }.join('|')}) |
  (?<logical_operator>\b(?:#{LogicalNode::OPERATORS.join('|')})\b) |
  (?<boolean_literal>\b(?:#{ValueNode::BOOLEAN_LITERALS.join('|')})\b) |
  (?<number_literal>\d+) |
  (?<string_literal>"[^"]*"|'[^']*') |
  (?<identifier>[a-z_][a-z0-9_\.]*) |
  (?<unknown>.)
}ix.freeze
and the syntax highlighting works as expected.
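If you want to reassure yourself that switching delimiters does not change the regex itself, a small check along these lines should do. It is a cut-down sketch, not your tokenizer: the two patterns and the constant names SLASH_FORM and PERCENT_R_FORM are invented for the example.

```ruby
# The same extended-mode pattern written with both literal forms.
SLASH_FORM = /
  (?<number>\d+) |
  (?<word>[a-z_]+)
/ix

PERCENT_R_FORM = %r{
  (?<number>\d+) |
  (?<word>[a-z_]+)
}ix

# Regexp#== compares the pattern source and the options.
puts SLASH_FORM == PERCENT_R_FORM              # → true

# Named groups and the i flag behave the same with %r.
puts PERCENT_R_FORM.match('Total')[:word]      # → Total
```

A side benefit of %r{...} is that a literal / inside the pattern no longer needs escaping.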
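As an aside, Regexp::union builds an alternation from an Array of values, so you don't need to manually join or escape. Interpolate its source rather than the Regexp itself, because an interpolated Regexp carries its own flags and would switch off the outer i option for that group. This means you could just use:
  (?<comparison_operator>#{Regexp.union(ComparisonNode::OPERATORS).source}) |
  (?<logical_operator>\b(?:#{Regexp.union(LogicalNode::OPERATORS).source})\b) |
  (?<boolean_literal>\b(?:#{Regexp.union(ValueNode::BOOLEAN_LITERALS).source})\b) |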
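One caveat worth checking with Regexp.union: interpolating a Regexp object into another regex literal embeds it as (?-mix:...), which turns the outer i flag off inside that group, whereas interpolating its source keeps the outer flags in effect. The sketch below illustrates the difference. The operator lists are stand-ins, since the real ComparisonNode and LogicalNode constants are not shown here, and all constant names are hypothetical.

```ruby
# Stand-in operator lists (assumed values, not taken from the original classes).
LOGICAL_OPS = %w[AND OR].freeze
COMPARISON_OPS = %w[>= *].freeze

LOGICAL_UNION = Regexp.union(LOGICAL_OPS)

puts LOGICAL_UNION.source                      # → AND|OR
puts LOGICAL_UNION.to_s                        # → (?-mix:AND|OR)

# Regexp.union escapes metacharacters in the strings for you.
puts Regexp.union(COMPARISON_OPS).source       # → >=|\*

# Interpolating the Regexp itself: the group stays case-sensitive.
WITH_EMBEDDED_FLAGS = /\b(?:#{LOGICAL_UNION})\b/i
# Interpolating only the source: the outer i flag applies.
WITH_SOURCE_ONLY = /\b(?:#{LOGICAL_UNION.source})\b/i

puts WITH_EMBEDDED_FLAGS.match?('a and b')     # → false
puts WITH_SOURCE_ONLY.match?('a and b')        # → true
```

Since the tokenizer upcases logical operators and downcases boolean literals after matching, case-insensitive matching looks intended there, so the .source form is probably the safer choice.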