
Do indirect objects in PDF always have an EOL marker after the obj keyword?


The spec section 3.2.9 says:

The definition of an indirect object in a PDF file consists of its object number and generation number, followed by the value of the object itself bracketed between the keywords obj and endobj.

And then gives an example

12 0 obj
  (Brillig)
endobj

But it does not seem to say whether an EOL after the keyword obj is required. By contrast, section 3.2.7 explicitly requires an EOL after the stream keyword.

In practice, however, all PDF files I have examined have an EOL after the obj keyword. Did I miss anything in the spec?


Solution

  • First of all, you should use the actual PDF specification, i.e. ISO 32000, rather than one of the old PDF References, which were not considered normative.

    That being said, even the actual spec does not require an EOL after the obj keyword, so

    12 0 obj (A string in an indirect object) endobj 
    

    is valid.

    Actually, the spec only says that white space is used to separate the object number, the generation number, and the obj keyword, and comments are treated as white space. Thus, even constructs like this

    12   % A comment 
    0
    
         obj (A string in an indirect object) endobj 
    

    are valid.

    If you look at specific profiles of PDF, though, the situation can differ.

    PDF/A-1 (ISO 19005-1) for example requires:

    The object number and generation number shall be separated by a single white-space character. The generation number and obj keyword shall be separated by a single white-space character.

    The object number and endobj keyword shall each be preceded by an EOL marker. The obj and endobj keywords shall each be followed by an EOL marker.
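
To make the difference concrete, here is a minimal Python sketch of the two readings. It is only an illustration under stated assumptions: the names `parse_header` and `is_pdfa1_style_header` are hypothetical, the regular expressions cover just the object header (not the object value), and "single white-space character" is taken to mean any one of the six PDF white-space characters.

```python
import re

# PDF white-space characters (ISO 32000-1, 7.2.2): NUL, HT, LF, FF, CR, SP.
WS = rb"[\x00\t\n\x0c\r ]"
EOL = rb"(?:\r\n|\r|\n)"
# A comment runs from '%' up to and including the EOL marker and
# counts as white space between tokens.
COMMENT = rb"%[^\r\n]*" + EOL
# Any non-empty run of white-space characters and comments.
SEP = rb"(?:" + WS + rb"|" + COMMENT + rb")+"

# Plain ISO 32000 reading: tokens separated by arbitrary white space.
# The lookahead only checks that "obj" ends at white space, a delimiter,
# or the end of the data, so that e.g. "objx" is not accepted.
LENIENT_HEADER = re.compile(
    rb"(\d+)" + SEP + rb"(\d+)" + SEP
    + rb"obj(?=[\x00\t\n\x0c\r ()<>\[\]{}/%]|$)"
)

# PDF/A-1 reading: exactly one white-space character between the tokens
# and an EOL marker directly after "obj".
STRICT_HEADER = re.compile(
    rb"(\d+)" + WS + rb"(\d+)" + WS + rb"obj" + EOL
)


def parse_header(data: bytes, offset: int = 0):
    """Return (object number, generation number) or None (hypothetical helper)."""
    m = LENIENT_HEADER.match(data, offset)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def is_pdfa1_style_header(data: bytes, offset: int) -> bool:
    """Check the PDF/A-1 layout rules for the header at offset (hypothetical helper)."""
    # The object number itself must be preceded by an EOL marker.
    preceded_by_eol = offset > 0 and data[offset - 1:offset] in (b"\r", b"\n")
    return preceded_by_eol and STRICT_HEADER.match(data, offset) is not None


one_line = b"\n12 0 obj (A string in an indirect object) endobj\n"
print(parse_header(one_line, 1))           # accepted by the lenient reading
print(is_pdfa1_style_header(one_line, 1))  # rejected by the PDF/A-1 reading
```

A reader that wants to open arbitrary PDFs should behave like the lenient variant; the strict check is only relevant when validating or producing PDF/A-1.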