c# pdf filestream reverse binaryreader

How can you read a specific value from a binary file, starting from the end of the file?


I am trying to figure out how to properly read the Byte_offset_of_last_cross-reference_section value from the trailer of a PDF file.

According to the PDF 1.7 (ISO 32000-1:2008) specification, the file structure is designed to be read from the end of the file. Here is what a simplified (minimal) trailer looks like, followed by the actual trailer I get when I use a StreamReader and read the file line by line (UTF-8 encoding):

trailer
<< key1 value1
     key2 value2
     …
     keyn valuen
>>
startxref
Byte_offset_of_last_cross-reference_section
%%EOF

trailer
<</Root 7 0 R /Size 7>>
startxref
696
%%EOF

The value I want to somehow grab is the 696 value. I'm just not sure how to do that using a BinaryReader starting from the end of the file.


Solution

  • You can use the stream's Seek method with SeekOrigin.End as the origin argument. The offset is then measured from the end of the file, so it must be negative to move backwards into the file.

    example:

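    using (var stream = File.Open(...))
    {
        // A negative offset moves backwards from the end of the file;
        // a positive one would position the stream past the end.
        stream.Seek(-100, SeekOrigin.End);
        //...
    }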
    

    From there you have two options. You can read backwards in a loop until you reach the startxref marker (or anything else that tells you where the 696 begins). Alternatively, you can assume the trailer lies within the last 100 bytes of the file, read those bytes into a small array, and search that array, as Anthony suggested in the comment below.
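
The second option (read a fixed-size tail and search it) could be sketched roughly as follows. This is only an illustration, not code from the original answer: the class PdfTail, the method ReadStartXref, and the default tailSize of 1024 bytes are all made-up names and values, and the sketch assumes the startxref keyword and its number fall entirely inside that tail.

```csharp
using System;
using System.IO;
using System.Text;

public static class PdfTail
{
    // Hypothetical helper: returns the number that follows the last
    // "startxref" keyword in the tail of the stream, or -1 if none is found.
    // tailSize is an assumed window size, not a value from the answer.
    public static long ReadStartXref(Stream stream, int tailSize = 1024)
    {
        // Read at most tailSize bytes, or the whole stream if it is shorter.
        int length = (int)Math.Min(tailSize, stream.Length);
        stream.Seek(-length, SeekOrigin.End); // negative offset: back from the end

        byte[] tail = new byte[length];
        int read = 0;
        while (read < length)
        {
            int n = stream.Read(tail, read, length - read);
            if (n == 0) break; // unexpected end of stream
            read += n;
        }

        // ASCII decoding yields one char per byte; non-ASCII bytes become '?',
        // which is harmless because the keyword and digits are plain ASCII.
        string text = Encoding.ASCII.GetString(tail, 0, read);

        const string keyword = "startxref";
        int pos = text.LastIndexOf(keyword, StringComparison.Ordinal);
        if (pos < 0) return -1;

        // Skip the line break after the keyword, then collect the digits.
        int i = pos + keyword.Length;
        while (i < text.Length && char.IsWhiteSpace(text[i])) i++;
        int start = i;
        while (i < text.Length && text[i] >= '0' && text[i] <= '9') i++;

        return i > start ? long.Parse(text.Substring(start, i - start)) : -1;
    }
}
```

With a stream opened on the file from the question (for example via File.OpenRead), calling PdfTail.ReadStartXref(stream) should return 696. Searching for the last occurrence of the keyword matters because a PDF with incremental updates can contain several trailers, and the final one is the one to start from.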