I'm looking for a grammar of PDF 1.7 (BNF or a variant).
I couldn't find one through web searches.
PDF is a binary format whose syntax is not context-free. For example, you need to read and interpret the declared length of a binary stream before you can parse the stream itself.
Example:
10 0 obj
<</Type /XObject
/Subtype /Image
/Width 260
/Height 52
/ColorSpace /DeviceRGB
/SMask 10 0 R
/BitsPerComponent 8
/Filter /FlateDecode
/Length 4570>> stream
--- insert binary data here ---
endstream
endobj
There is no way to tell whether your binary data will contain the tokens endstream or endobj, so you have no choice but to read the length of the stream before parsing it.
BNF can only describe context-free grammars, so it is not possible to construct a BNF grammar that covers the complete PDF syntax.
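To make the length-first approach concrete, here is a minimal Python sketch. It is not a real PDF parser: the function name extract_stream is made up for illustration, and it assumes /Length is a direct integer (not an indirect reference such as 12 0 R) and that the object is already in memory as bytes.

```python
import re


def extract_stream(obj: bytes) -> bytes:
    """Return the raw stream payload of a single PDF object (simplified sketch)."""
    # Assumption: /Length is a direct integer. The lookahead rejects
    # indirect references of the form "/Length 12 0 R".
    m = re.search(rb"/Length\s+(\d+)\b(?!\s+\d+\s+R)", obj)
    if m is None:
        raise ValueError("no direct /Length entry found")
    length = int(m.group(1))

    # The payload begins right after the 'stream' keyword and its end-of-line.
    s = re.compile(rb"stream(\r\n|\n)").search(obj, m.end())
    if s is None:
        raise ValueError("no stream keyword found")
    start = s.end()

    # Skip exactly `length` bytes without inspecting them.
    payload = obj[start:start + length]
    if len(payload) != length:
        raise ValueError("stream is shorter than /Length claims")

    # Only now is it safe to look for the closing keyword.
    if not re.match(rb"\s*endstream", obj[start + length:]):
        raise ValueError("endstream not found where /Length says it should be")
    return payload


if __name__ == "__main__":
    # Binary data that happens to contain the closing tokens.
    data = b"\x00endstream\nendobj\n\xff"
    obj = b"10 0 obj\n<</Length %d>> stream\n" % len(data)
    obj += data + b"\nendstream\nendobj\n"
    assert extract_stream(obj) == data
```

Scanning for the first endstream instead would cut the payload short in this example, which is exactly the context dependence a BNF grammar cannot express.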
Take a look at the specification here: PDF Reference Document