I am writing a PDF parsing library.
Once upon a time, I had an input to parse like this one:
1 0 obj
(anything)
endobj
I created a parsing rule for the outer container and a separate rule for the inner object:
CONTAINER_PARSER %=
number >> number >> "obj" >> OBJECT_PARSER >> "endobj";
OBJECT_PARSER %= number | value | ...
This worked without any problems. But, for various reasons, I had to redesign the rules so that both container values (the two numbers) belong to the object itself.
The container itself is optional. That is, the previous input and the following one denote the same object, just without the additional container info:
(anything)
I had two ideas for solving this problem, but it seems to me that both are incompatible with the Qi approach.
My first idea was to tell the parser to parse either a value enclosed in obj ... endobj, or the bare value alone.
start %=
(
object_number
>> generation_number
>> qi::lit("obj")
>> object
> qi::lit("endobj")
) | object;
// I intentionally omitted the semantic actions that assign the values to the object,
// because they are outside the scope of my problem
I didn't manage to make this work, because both branches of the alternative expose the same attribute, and the compiler was confused.
My second idea was to tell the parser that the container is merely optional around the parsed value.
start %=
-(
object_number
>> generation_number
>> qi::lit("obj")
)
>> object
> -qi::lit("endobj");
The problem with this approach is that the trailing "endobj" must be present if, and only if, the leading part is present as well, and this rule does not enforce that.
The solution might be trivial, but I really was not able to figure it out from the code, the documentation, or Stack Overflow answers.
UPDATE After the comment:
start =
(
( object_number >> generation_number
| qi::attr(1) > qi::attr(0) // defaults
) >> "obj" >> object > "endobj"
| qi::attr(1) >> qi::attr(0) >> object
)
;
Assuming you're not interested in the (optional) numbers:
start =
-qi::omit [ object_number >> generation_number ]
>> "obj" >> object > "endobj"
;
If you are interested and have suitable defaults:
start =
( object_number >> generation_number
| qi::attr(1) > qi::attr(0) // defaults
)
>> "obj" >> object > "endobj"
;
Of course, you could:
- alter the recipient type to expect optional<int> for the object numbers, so you could simply write -object_number >> -generation_number. This would be kind of sloppy, since it also allows "1 obj (anything) endobj".
- alter the recipient type to be a variant: boost::variant<simple_object, object_container>. In this case your AST matches the "alternative" approach (the first one) from your question.