Fragmenting and Un-fragmenting files in C


I want to take a file (text or binary), split it into fragments of a fixed size (about 250-500 kB), randomize the order of the fragments, and write them into a single temporary fragmented file.

Un-fragmenting would then read the fragmented file, extract the pieces, put them back in order, and reconstruct the original file intact.

This would be very easy for simple ASCII text files, since the C library functions (like sscanf) can format and parse the information. The fragmented file could then use a format like

(#### <fragment #> <fragment> ...)

However, I am not sure how to do something like that with binary files.

I know one easy solution is to use a separate file for each fragment (.part1, .part2, and so on), but that would be a bit ugly and wouldn't scale well to much larger files. It would be a lot better to store everything in one file.

Thanks a lot.


Solution

  • Treat everything as binary data (open the files in binary mode, "rb" and "wb"). In your fragmented file, store each block using this layout:

    OFFSET SIZE  DESCRIPTION
         0    4  BLOCK NUMBER
         4    4  BLOCK SIZE IN BYTES
         8    ?  BLOCK DATA
    

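    Define a header structure. Writing the struct directly ties the file to the byte order of the machine that wrote it, which is fine for a temporary file:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct hdr
    {
        uint32_t number;   /* index of the block in the original file */
        uint32_t size;     /* length of the block data in bytes */
    } hdr_t;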
    

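    Code that appends a block can look like:

    /* Append one block to f: header first, then the data.
       Returns 0 on success, -1 on a write error. */
    int file_append(FILE *f, size_t block, size_t size, const void *data)
    {
        hdr_t hdr;
        hdr.number = (uint32_t)block;
        hdr.size = (uint32_t)size;
        if (fwrite(&hdr, sizeof(hdr), 1, f) != 1)
            return -1;
        if (size > 0 && fwrite(data, size, 1, f) != 1)
            return -1;
        return 0;
    }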
    

    And reading the next block back:

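    /* Read the next block from f. Returns 0 on success, -1 at end of file
       or on error. The caller must free(*data). */
    int file_read_chunk(FILE *f, size_t *block, size_t *size, void **data)
    {
        hdr_t hdr;

        if (fread(&hdr, sizeof(hdr), 1, f) != 1)
            return -1;
        *block = hdr.number;
        *size = hdr.size;
        *data = malloc(hdr.size ? hdr.size : 1);
        if (*data == NULL)
            return -1;
        if (hdr.size > 0 && fread(*data, hdr.size, 1, f) != 1) {
            free(*data);
            return -1;
        }
        return 0;
    }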
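
The answer above covers the block format but not the shuffling or the reassembly asked about in the question. The following is a minimal sketch of one way to do both, not a definitive implementation. The names fragment_file and defragment_file are hypothetical, the shuffle is a plain Fisher-Yates pass driven by rand(), and the code assumes both streams are seekable and opened in binary mode. Reassembly builds an index of where each block's data starts, so it does not need to know the chunk size that was used.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct hdr {
    uint32_t number;  /* index of the block in the original file */
    uint32_t size;    /* payload length in bytes */
} hdr_t;

/* Copy `in` to `out` as shuffled, header-prefixed blocks of at most `chunk`
 * bytes. Hypothetical helper; returns 0 on success, -1 on error. */
int fragment_file(FILE *in, FILE *out, uint32_t chunk)
{
    uint32_t *order = NULL;
    unsigned char *buf = NULL;
    int rc = -1;

    assert(chunk > 0);
    if (fseek(in, 0, SEEK_END) != 0) return -1;
    long total = ftell(in);
    if (total < 0) return -1;
    unsigned long blocks = ((unsigned long)total + chunk - 1) / chunk;
    if (blocks == 0) return 0;                /* empty input, nothing to do */
    if (blocks > UINT32_MAX) return -1;
    uint32_t n = (uint32_t)blocks;

    order = malloc(n * sizeof *order);
    buf = malloc(chunk);
    if (order == NULL || buf == NULL) goto done;
    for (uint32_t i = 0; i < n; i++) order[i] = i;
    for (uint32_t i = n - 1; i > 0; i--) {    /* Fisher-Yates shuffle */
        uint32_t j = (uint32_t)rand() % (i + 1);
        uint32_t t = order[i]; order[i] = order[j]; order[j] = t;
    }
    for (uint32_t i = 0; i < n; i++) {        /* write blocks in shuffled order */
        hdr_t hdr = { order[i], 0 };
        if (fseek(in, (long)order[i] * (long)chunk, SEEK_SET) != 0) goto done;
        hdr.size = (uint32_t)fread(buf, 1, chunk, in);
        if (hdr.size == 0) goto done;
        if (fwrite(&hdr, sizeof hdr, 1, out) != 1) goto done;
        if (fwrite(buf, 1, hdr.size, out) != hdr.size) goto done;
    }
    rc = 0;
done:
    free(order);
    free(buf);
    return rc;
}

/* Write the payloads found in `in` to `out` in index order. Returns 0 on
 * success, -1 on I/O error or a missing or out-of-range block number. */
int defragment_file(FILE *in, FILE *out)
{
    hdr_t hdr;
    uint32_t n = 0, maxlen = 1;
    long *pos = NULL;
    uint32_t *len = NULL;
    unsigned char *buf = NULL;
    int rc = -1;

    rewind(in);                               /* pass 1: count the blocks */
    while (fread(&hdr, sizeof hdr, 1, in) == 1) {
        if (hdr.size > maxlen) maxlen = hdr.size;
        if (fseek(in, (long)hdr.size, SEEK_CUR) != 0) return -1;
        n++;
    }
    if (n == 0) return 0;

    pos = malloc(n * sizeof *pos);
    len = malloc(n * sizeof *len);
    buf = malloc(maxlen);
    if (pos == NULL || len == NULL || buf == NULL) goto done;
    for (uint32_t i = 0; i < n; i++) pos[i] = -1;

    rewind(in);                               /* pass 2: index the payloads */
    while (fread(&hdr, sizeof hdr, 1, in) == 1) {
        if (hdr.number >= n) goto done;
        pos[hdr.number] = ftell(in);
        len[hdr.number] = hdr.size;
        if (fseek(in, (long)hdr.size, SEEK_CUR) != 0) goto done;
    }
    for (uint32_t i = 0; i < n; i++) {        /* pass 3: copy in index order */
        if (pos[i] < 0 || fseek(in, pos[i], SEEK_SET) != 0) goto done;
        if (fread(buf, 1, len[i], in) != len[i]) goto done;
        if (fwrite(buf, 1, len[i], out) != len[i]) goto done;
    }
    rc = 0;
done:
    free(pos);
    free(len);
    free(buf);
    return rc;
}
```

A caller would seed the generator once with srand(), open the input with "rb" and the fragmented file with "wb+" (or use tmpfile()), and call fragment_file(in, frag, 256 * 1024) followed later by defragment_file(frag, out). rand() is only good enough to scramble the order; it is not suitable if the shuffle has to be unpredictable.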