
Matching arbitrary text in a line-oriented file with Xtext


I am trying to write an Xtext grammar for configuration files (the kind that usually carry the .ini extension).

For instance, I'd like to be able to parse:

[Section1]
a = Easy123
b = This *is* valid too

[Section_2]
c = Voilà # inline comments are ignored

My problem is matching the property value (the text to the right of the '=').

My current grammar works only if the property value matches the ID terminal (e.g. a = Easy123):

PropertyFile hidden(SL_COMMENT, WS):
    sections+=Section*;

Section:
    '[' name=ID ']'
    (NEWLINE properties+=Property)+
    NEWLINE+;

Property:
    name=ID (':' | '=') value=ID ';'?;

terminal WS:
    (' ' | '\t')+;

terminal NEWLINE:
// New line on DOS or Unix 
    '\r'? '\n';

terminal ID:
    ('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '_' | '-' | '0'..'9')*;

terminal SL_COMMENT:
// Single line comment
    '#' !('\n' | '\r')*;

I don't know how to generalize the grammar so that the value can be any text (e.g. c = Voilà).

I presumably need to introduce a new terminal and use it in the rule: Property: name=ID (':' | '=') value=TEXT ';'?;

The question is: how should I define this TEXT terminal?

I have tried:

  • terminal TEXT: ANY_OTHER+; This raises a warning:

    The following token definitions can never be matched because prior tokens match the same input: RULE_INT,RULE_STRING,RULE_ML_COMMENT,RULE_ANY_OTHER

    (I don't think that matters.)

    Parsing fails with:

    Required loop (...)+ did not match anything at input 'à'

  • terminal TEXT: !('\r'|'\n'|'#')+; This raises a warning:

    The following token definitions can never be matched because prior tokens match the same input: RULE_INT

    (I don't think that matters.)

    Parsing fails with:

    Missing EOF at [Section1]

  • terminal TEXT: ('!'|'$'..'~'); (which covers the printable ASCII characters except # and "). No warning is raised when the lexer/parser is generated. However, parsing fails with:

    Mismatch input 'Easy123' expecting RULE_TEXT

    Extraneous input 'This' expecting RULE_TEXT

    Required loop (...)+ did not match anything at 'is'

Thanks for your help (and I hope this grammar can be useful for others too)


Solution

  • This grammar does the trick:

    grammar org.xtext.example.mydsl.MyDsl hidden(SL_COMMENT, WS)
    
    generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
    import "http://www.eclipse.org/emf/2002/Ecore"
    
    PropertyFile:
        sections+=Section*;
    
    Section:
        '[' name=ID ']' 
        (NEWLINE+ properties+=Property)+
        NEWLINE+;
    
    Property:
        name=ID value=PROPERTY_VALUE;
    
    terminal PROPERTY_VALUE: (':' | '=') !('\n' | '\r')*;
    
    terminal WS:
        (' ' | '\t')+;
    
    terminal NEWLINE:
    // New line on DOS or Unix 
        '\r'? '\n';
    
    terminal ID:
        ('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '_' | '-' | '0'..'9')*;
    
    terminal SL_COMMENT:
    // Single line comment
        '#' !('\n' | '\r')*;
    

    The key is not to try to capture the complete semantics in the grammar alone, but to make use of other services as well. The terminal rule PROPERTY_VALUE consumes everything from the ':' or '=' to the end of the line, that is, the complete value including the leading assignment character and the optional trailing semicolon.
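
To make concrete what the lexer hands to the parser: the PROPERTY_VALUE terminal behaves roughly like the regular expression `[:=][^\n\r]*`. The following plain-Java sketch applies that expression to single lines from the sample file. The class and method names are hypothetical and not part of Xtext, and the sketch only approximates the generated lexer, which additionally tokenizes the key as ID and hides whitespace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Xtext: approximates the PROPERTY_VALUE
// terminal with an equivalent regular expression.
final class PropertyValueTokenSketch {

    // (':' | '=') !('\n' | '\r')*   is roughly   [:=][^\n\r]*
    private static final Pattern PROPERTY_VALUE = Pattern.compile("[:=][^\\n\\r]*");

    // Returns the text the terminal would consume on this line, or null if
    // the line contains neither ':' nor '='.
    static String rawToken(String line) {
        Matcher m = PROPERTY_VALUE.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(rawToken("a = Easy123"));
        System.out.println(rawToken("b : This *is* valid too;"));
    }
}
```

The raw token still carries the separator, the blank after it, and any trailing semicolon, which is why the value needs to be post-processed outside the grammar.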

    Now register a value converter service for the language and strip the insignificant parts of the input there:

    import org.eclipse.xtext.conversion.IValueConverter;
    import org.eclipse.xtext.conversion.ValueConverter;
    import org.eclipse.xtext.conversion.ValueConverterException;
    import org.eclipse.xtext.conversion.impl.AbstractDeclarativeValueConverterService;
    import org.eclipse.xtext.conversion.impl.AbstractIDValueConverter;
    import org.eclipse.xtext.conversion.impl.AbstractLexerBasedConverter;
    import org.eclipse.xtext.nodemodel.INode;
    import org.eclipse.xtext.util.Strings;
    
    import com.google.inject.Inject;
    
    public class PropertyConverters extends AbstractDeclarativeValueConverterService {
        @Inject
        private AbstractIDValueConverter idValueConverter;
    
        @ValueConverter(rule = "ID")
        public IValueConverter<String> ID() {
            return idValueConverter;
        }
    
        @Inject
        private PropertyValueConverter propertyValueConverter;
    
        @ValueConverter(rule = "PROPERTY_VALUE")
        public IValueConverter<String> PropertyValue() {
            return propertyValueConverter;
        }
    
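        public static class PropertyValueConverter extends AbstractLexerBasedConverter<String> {
    
            // Used when serializing: re-adds the separator that toValue strips.
            @Override
            protected String toEscapedString(String value) {
                return " = " + Strings.convertToJavaString(value, false);
            }
    
            // Used when parsing: turns the raw token (e.g. ": foo;") into the plain value ("foo").
            @Override
            public String toValue(String string, INode node) {
                if (string == null)
                    return null;
                try {
                    // Drop the leading ':' or '=' and the surrounding whitespace.
                    String value = string.substring(1).trim();
                    // Drop the optional trailing semicolon.
                    if (value.endsWith(";")) {
                        value = value.substring(0, value.length() - 1);
                    }
                    return value;
                } catch (IllegalArgumentException e) {
                    throw new ValueConverterException(e.getMessage(), node, e);
                }
            }
        }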
    }
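
Note that with this terminal an inline comment ends up in the value, as the last assertion of the test case shows. If that is not wanted, the converter could additionally cut the value at the first '#'. Here is a minimal sketch of that clean-up in plain Java, independent of Xtext. The class and method names are hypothetical, and it assumes that '#' never appears in a legitimate value.

```java
// Hypothetical helper, not part of Xtext: the same clean-up as toValue(...)
// plus removal of a trailing "# ..." comment.
final class PropertyValueCleanupSketch {

    static String clean(String rawToken) {
        if (rawToken == null || rawToken.isEmpty()) {
            return rawToken;
        }
        // Drop the leading ':' or '=' consumed by the terminal.
        String value = rawToken.substring(1);
        // Assumption: '#' always starts a comment and never occurs in a value.
        int hash = value.indexOf('#');
        if (hash >= 0) {
            value = value.substring(0, hash);
        }
        value = value.trim();
        // Drop the optional trailing semicolon and any blanks before it.
        if (value.endsWith(";")) {
            value = value.substring(0, value.length() - 1).trim();
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(clean("= Voil\u00e0 # inline comments are ignored"));
        System.out.println(clean(": This *is* valid too;"));
    }
}
```

The body of toValue could delegate to such a method. The final assertion of the test case would then have to expect "Voilà" instead of the value with the comment attached.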
    

    The following test case succeeds once you have registered the service in the runtime module like this:

    @Override
    public Class<? extends IValueConverterService> bindIValueConverterService() {
        return PropertyConverters.class;
    }
    

    Test case:

    import org.junit.runner.RunWith
    import org.eclipse.xtext.junit4.XtextRunner
    import org.xtext.example.mydsl.MyDslInjectorProvider
    import org.eclipse.xtext.junit4.InjectWith
    import org.junit.Test
    import org.eclipse.xtext.junit4.util.ParseHelper
    import com.google.inject.Inject
    import org.xtext.example.mydsl.myDsl.PropertyFile
    import static org.junit.Assert.*
    
    @RunWith(typeof(XtextRunner))
    @InjectWith(typeof(MyDslInjectorProvider))
    class ParserTest {
    
        @Inject
        ParseHelper<PropertyFile> helper
    
        @Test
        def void testSample() {
            val file = helper.parse('''
                [Section1]
                a = Easy123
                b : This *is* valid too;
    
                [Section_2]
                # comment
                c = Voilà # inline comments are ignored
            ''')
            assertEquals(2, file.sections.size)
            val section1 = file.sections.head
            assertEquals(2, section1.properties.size)
            assertEquals("a", section1.properties.head.name)
            assertEquals("Easy123", section1.properties.head.value)
            assertEquals("b", section1.properties.last.name)
            assertEquals("This *is* valid too", section1.properties.last.value)
    
            val section2 = file.sections.last
            assertEquals(1, section2.properties.size)
            assertEquals("Voilà # inline comments are ignored", section2.properties.head.value)
        }
    
    }