c++ file-handling class-design

Designing file reader for multiple file formats/headers


I'm trying to design a Reader class that can read multiple file types (mostly binary representations of something). To get the metadata from a file, it will use a Header class, which will somehow tell the Reader the size of the file header (maybe through a static field, I'm not sure), i.e. the number of bytes to read and pass to the header on the first read. The Header class will parse the header accordingly and provide some abstract methods like std::size_t getChunkSize(). The Header will then tell the Reader the number of bytes in each chunk of data. That's what I have in mind. How can I properly represent this hierarchy, or is there a better way to do this?


Solution

  • Generally, implementing a generic Reader class for binary formats can be quite challenging. Since some file formats have variable-length headers, I would suggest putting the reading of the header completely into the Header class. Something like:

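    #include <cstddef>
    #include <cstdint>

    struct Header {
        virtual ~Header() = default;  // needed when deleting through a Header*
        // Parses the header from buf; assumed to return the bytes consumed.
        virtual std::size_t parse(const std::uint8_t* buf, std::size_t len) = 0;
        virtual int getNumChunks() const = 0;
        virtual std::size_t getChunkSize(int chunk) const = 0;
    };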
    

    Then, in your Reader class, you can have a read method that first parses the Header and then reads the rest of the file based on the chunk sizes. An offset method that returns the start of a chunk would also be helpful for some file formats. Depending on the file formats you want to parse, it may also be hard to determine the chunk size at all, for example if there are no chunk headers and the chunks contain variable-length arrays. In that case it may be a better idea to also implement a Chunk class that reads each chunk.
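
To make the Reader side concrete, here is a minimal sketch, not a definitive implementation. It assumes the whole file is already in memory and that parse returns the number of header bytes it consumed. ToyHeader, its one-byte-count layout, and the Reader methods chunkOffset and readChunk are names invented for this illustration; they do not correspond to any real file format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Interface from the answer, with a virtual destructor and const accessors.
struct Header {
    virtual ~Header() = default;
    // Assumption: returns the number of bytes the header occupies.
    virtual std::size_t parse(const std::uint8_t* buf, std::size_t len) = 0;
    virtual int getNumChunks() const = 0;
    virtual std::size_t getChunkSize(int chunk) const = 0;
};

// Hypothetical toy format, invented for this sketch:
//   byte 0     : number of chunks N
//   bytes 1..N : size of each chunk, one byte per chunk
// The header length therefore depends on the file contents.
struct ToyHeader : Header {
    std::vector<std::size_t> sizes;

    std::size_t parse(const std::uint8_t* buf, std::size_t len) override {
        if (len < 1 || len < 1u + buf[0]) {
            throw std::runtime_error("truncated header");
        }
        sizes.assign(buf + 1, buf + 1 + buf[0]);
        return 1u + buf[0];
    }
    int getNumChunks() const override {
        return static_cast<int>(sizes.size());
    }
    std::size_t getChunkSize(int chunk) const override {
        return sizes.at(static_cast<std::size_t>(chunk));
    }
};

// The Reader knows nothing about the format; it only talks to Header.
class Reader {
public:
    Reader(std::vector<std::uint8_t> data, Header& header)
        : data_(std::move(data)), header_(header) {
        headerSize_ = header_.parse(data_.data(), data_.size());
    }

    // Start of a chunk: header size plus the sizes of all earlier chunks.
    std::size_t chunkOffset(int chunk) const {
        assert(chunk >= 0 && chunk < header_.getNumChunks());
        std::size_t offset = headerSize_;
        for (int i = 0; i < chunk; ++i) {
            offset += header_.getChunkSize(i);
        }
        return offset;
    }

    // Copies the bytes of one chunk out of the in-memory file.
    std::vector<std::uint8_t> readChunk(int chunk) const {
        const std::size_t begin = chunkOffset(chunk);
        const std::size_t size = header_.getChunkSize(chunk);
        if (begin + size > data_.size()) {
            throw std::runtime_error("truncated chunk");
        }
        const std::uint8_t* first = data_.data() + begin;
        return std::vector<std::uint8_t>(first, first + size);
    }

private:
    std::vector<std::uint8_t> data_;
    Header& header_;
    std::size_t headerSize_ = 0;
};
```

Supporting another format then means writing another Header subclass and passing it to the same Reader. This only works while chunk sizes are known from the header; for formats where they are not, the Chunk class mentioned above would have to take over the per-chunk parsing.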