I am trying to write an Xtext grammar for configuration files (commonly known by their .ini extension).
For instance, I'd like to successfully parse:
[Section1]
a = Easy123
b = This *is* valid too
[Section_2]
c = Voilà # inline comments are ignored
My problem is matching the property value (whatever is to the right of the '=').
My current grammar works as long as the property value matches the ID terminal (e.g. a = Easy123).
PropertyFile hidden(SL_COMMENT, WS):
    sections+=Section*;

Section:
    '[' name=ID ']'
    (NEWLINE properties+=Property)+
    NEWLINE+;

Property:
    name=ID (':' | '=') value=ID ';'?;

terminal WS:
    (' ' | '\t')+;

terminal NEWLINE:
    // New line on DOS or Unix
    '\r'? '\n';

terminal ID:
    ('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '_' | '-' | '0'..'9')*;

terminal SL_COMMENT:
    // Single line comment
    '#' !('\n' | '\r')*;
I don't know how to generalize the grammar so that the value matches any text (e.g. c = Voilà).
I certainly need to introduce a new terminal:
Property: name=ID (':' | '=') value=TEXT ';'?;
The question is: how should I define this TEXT terminal?
I have tried three definitions.

1. terminal TEXT: ANY_OTHER+;
This raises a warning:
The following token definitions can never be matched because prior tokens match the same input: RULE_INT,RULE_STRING,RULE_ML_COMMENT,RULE_ANY_OTHER
(I think it doesn't matter.)
Parsing fails with:
Required loop (...)+ did not match anything at input 'à'

2. terminal TEXT: !('\r'|'\n'|'#')+;
This raises a warning:
The following token definitions can never be matched because prior tokens match the same input: RULE_INT
(I think it doesn't matter.)
Parsing fails with:
Missing EOF at [Section1]

3. terminal TEXT: ('!'|'$'..'~');
(which covers most characters, except # and ")
No warning during the generation of the lexer/parser. However, parsing fails with:
Mismatch input 'Easy123' expecting RULE_TEXT
Extraneous input 'This' expecting RULE_TEXT
Required loop (...)+ did not match anything at 'is'

Thanks for your help (and I hope this grammar can be useful for others too)
This grammar does the trick:
grammar org.xtext.example.mydsl.MyDsl hidden(SL_COMMENT, WS)

generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
import "http://www.eclipse.org/emf/2002/Ecore"

PropertyFile:
    sections+=Section*;

Section:
    '[' name=ID ']'
    (NEWLINE+ properties+=Property)+
    NEWLINE+;

Property:
    name=ID value=PROPERTY_VALUE;

terminal PROPERTY_VALUE: (':' | '=') !('\n' | '\r')*;

terminal WS:
    (' ' | '\t')+;

terminal NEWLINE:
    // New line on DOS or Unix
    '\r'? '\n';

terminal ID:
    ('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '_' | '-' | '0'..'9')*;

terminal SL_COMMENT:
    // Single line comment
    '#' !('\n' | '\r')*;
The key is that you should not try to cover the complete semantics in the grammar alone; take other services into account, too. The terminal rule PROPERTY_VALUE consumes the rest of the line: the leading assignment character (':' or '='), the value itself, and an optional trailing semicolon.
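To make concrete what this terminal swallows, here is a rough approximation in plain Java. The regular expression and the class name PropertyValueTokenSketch are illustrative assumptions, not part of Xtext; the generated lexer also tokenizes the ID and skips hidden whitespace before it reaches the assignment character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: approximates the PROPERTY_VALUE terminal with a regex. */
final class PropertyValueTokenSketch {

    // (':' | '=') !('\n' | '\r')*  is roughly: one ':' or '=' followed by
    // any run of characters that are not line terminators.
    private static final Pattern PROPERTY_VALUE = Pattern.compile("[:=][^\\n\\r]*");

    /** Returns the text the terminal would consume on this line, or null if the line has no ':' or '='. */
    static String rawToken(String line) {
        Matcher m = PROPERTY_VALUE.matcher(line);
        return m.find() ? m.group() : null;
    }
}
```

For the line `b : This *is* valid too;` the sketch yields `: This *is* valid too;`, so the separator and the semicolon are still attached. A trailing `# ...` on the same line is captured as well, since the token runs to the end of the line.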
Now register a value converter service for the language and handle the insignificant parts of the input there:
import org.eclipse.xtext.conversion.IValueConverter;
import org.eclipse.xtext.conversion.ValueConverter;
import org.eclipse.xtext.conversion.ValueConverterException;
import org.eclipse.xtext.conversion.impl.AbstractDeclarativeValueConverterService;
import org.eclipse.xtext.conversion.impl.AbstractIDValueConverter;
import org.eclipse.xtext.conversion.impl.AbstractLexerBasedConverter;
import org.eclipse.xtext.nodemodel.INode;
import org.eclipse.xtext.util.Strings;
import com.google.inject.Inject;
public class PropertyConverters extends AbstractDeclarativeValueConverterService {

    @Inject
    private AbstractIDValueConverter idValueConverter;

    @ValueConverter(rule = "ID")
    public IValueConverter<String> ID() {
        return idValueConverter;
    }

    @Inject
    private PropertyValueConverter propertyValueConverter;

    @ValueConverter(rule = "PROPERTY_VALUE")
    public IValueConverter<String> PropertyValue() {
        return propertyValueConverter;
    }

    public static class PropertyValueConverter extends AbstractLexerBasedConverter<String> {

        // Serialization: put an assignment character back in front of the value.
        @Override
        protected String toEscapedString(String value) {
            return " = " + Strings.convertToJavaString(value, false);
        }

        // Parsing: strip the leading ':' or '=' and an optional trailing ';'.
        @Override
        public String toValue(String string, INode node) {
            if (string == null)
                return null;
            try {
                String value = string.substring(1).trim();
                if (value.endsWith(";")) {
                    value = value.substring(0, value.length() - 1);
                }
                return value;
            } catch (IllegalArgumentException e) {
                throw new ValueConverterException(e.getMessage(), node, e);
            }
        }
    }
}
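The string handling inside toValue can be tried out without an Xtext project. The following is a minimal sketch that mirrors that logic in a plain class; the name PropertyValueText and its strip method are hypothetical and exist only for this illustration.

```java
/** Hypothetical helper mirroring the toValue logic above, without Xtext dependencies. */
final class PropertyValueText {

    /** Takes the raw PROPERTY_VALUE token text and returns the bare value. */
    static String strip(String raw) {
        if (raw == null) {
            return null;
        }
        // Drop the leading ':' or '=' matched by the terminal, then the surrounding whitespace.
        String value = raw.substring(1).trim();
        // Drop one optional trailing semicolon.
        if (value.endsWith(";")) {
            value = value.substring(0, value.length() - 1);
        }
        return value;
    }
}
```

Calling `PropertyValueText.strip(": This *is* valid too;")` returns `This *is* valid too`, which is the value the parsed model exposes for property b.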
The following test case succeeds once you have registered the service in the runtime module like this:
@Override
public Class<? extends IValueConverterService> bindIValueConverterService() {
return PropertyConverters.class;
}
Test case:
import org.junit.runner.RunWith
import org.eclipse.xtext.junit4.XtextRunner
import org.xtext.example.mydsl.MyDslInjectorProvider
import org.eclipse.xtext.junit4.InjectWith
import org.junit.Test
import org.eclipse.xtext.junit4.util.ParseHelper
import com.google.inject.Inject
import org.xtext.example.mydsl.myDsl.PropertyFile
import static org.junit.Assert.*
@RunWith(typeof(XtextRunner))
@InjectWith(typeof(MyDslInjectorProvider))
class ParserTest {

    @Inject
    ParseHelper<PropertyFile> helper

    @Test
    def void testSample() {
        val file = helper.parse('''
            [Section1]
            a = Easy123
            b : This *is* valid too;
            [Section_2]
            # comment
            c = Voilà # inline comments are ignored
        ''')
        assertEquals(2, file.sections.size)
        val section1 = file.sections.head
        assertEquals(2, section1.properties.size)
        assertEquals("a", section1.properties.head.name)
        assertEquals("Easy123", section1.properties.head.value)
        assertEquals("b", section1.properties.last.name)
        assertEquals("This *is* valid too", section1.properties.last.value)
        val section2 = file.sections.last
        assertEquals(1, section2.properties.size)
        assertEquals("Voilà # inline comments are ignored", section2.properties.head.value)
    }
}
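Note that the last assertion expects the inline comment to remain part of the value. If it should be dropped instead, as the sample in the question suggests, one option is to cut it off in the value converter. This is a minimal sketch with a hypothetical helper name, and it assumes that '#' never occurs inside a legitimate value:

```java
/** Hypothetical helper: removes a trailing "# ..." comment from an already converted value. */
final class InlineCommentStripper {

    static String withoutInlineComment(String value) {
        if (value == null) {
            return null;
        }
        // Assumption: the first '#' always starts a comment.
        int hash = value.indexOf('#');
        return hash < 0 ? value : value.substring(0, hash).trim();
    }
}
```

Applied at the end of toValue, this would turn the value of property c into `Voilà`, and the last assertion of the test would have to be adjusted accordingly.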