Copy a whole file into memory using mmap


I want to copy a whole file into memory using mmap in C. I wrote this code:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <errno.h>
int main(int arg, char *argv[])
{
    char c ;
    int numOfWs = 0 ;
    int numOfPr = 0 ;
    int numberOfCharacters ;
    int i=0;
    int k;
    int pageSize = getpagesize();
    char *data;
    float wsP = 0;
    float prP = 0;
    int fp = open("2.txt", O_RDWR);
    data = mmap((caddr_t)0, pageSize, PROT_READ, MAP_SHARED, fp,pageSize);
    printf("%s\n", data);
    exit(0);
}

When I execute the code, I get a "Bus error" message. Next, I want to iterate over the copied file and do something with it. How can I copy the file correctly?


Solution

  • Two things.

    1. The second parameter of mmap() is the size of the portion of the file you want to make visible in your address space. The last one is the offset in the file at which the mapping starts. As you have called it, mmap() exposes only one page (typically 4096 bytes on x86 and ARM) starting at offset 4096 in your file. If your file is smaller than 4096 bytes, that page lies entirely past the end of the file. Depending on the system, mmap() may then fail and return MAP_FAILED (i.e. (void *)-1), or, as on Linux, succeed while any access to the page raises SIGBUS. You didn't check the return value of mmap(), so the following printf() reads through a pointer that is either invalid or not backed by any file data => BUS ERROR.
    2. Using a memory map with string functions can be difficult. If the file doesn't contain a binary 0 (NUL byte), these functions can run past the mapped size of the file and touch unmapped memory => SEGFAULT.
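One way to sidestep the second problem is to treat the mapping as a byte buffer with an explicit length instead of a C string. The following is only a minimal sketch: count_whitespace is a hypothetical helper name, and it assumes the caller passes the real file size (for example st_size from fstat()) as len.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count whitespace bytes in a buffer of known length.
 * The buffer need not be NUL-terminated, so this is safe on an mmap()ed
 * file as long as len does not exceed the mapped file size. */
size_t count_whitespace(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        /* Cast to unsigned char: isspace() is undefined for negative values. */
        if (isspace((unsigned char)buf[i]))
            count++;
    }
    return count;
}
```

Because the loop is bounded by len, it never depends on a terminating 0 being present in the file.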

    To open a memory map for a whole file, you have to know the size of the file.

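    struct stat filestat;

    /* fstat() reports the file size in st_size; fp is the descriptor from open(). */
    if (fstat(fp, &filestat) != 0) {
        perror("fstat failed");
        exit(1);
    }

    /* Map the whole file: length st_size, starting at offset 0. */
    data = mmap(NULL, filestat.st_size, PROT_READ, MAP_SHARED, fp, 0);
    if (data == MAP_FAILED) {
        perror("mmap failed");
        exit(2);
    }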
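Put together, the open/fstat/mmap steps could be wrapped roughly as follows. This is a sketch under assumptions, not a definitive implementation: map_whole_file is a hypothetical name, and an empty file is treated as a failure because a zero-length mapping is not allowed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map an entire file read-only.
 * On success returns the mapping and stores its length in *len_out.
 * On failure, or for an empty file (which cannot be mapped), returns NULL.
 * Release the mapping with munmap((void *)ptr, *len_out). */
const char *map_whole_file(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);
    *len_out = 0;

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* Length st_size at offset 0: the whole file from its first byte. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);

    /* The mapping stays valid after the descriptor is closed. */
    close(fd);

    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return p;
}
```

A caller would then iterate over bytes 0 to len - 1 of the returned pointer and call munmap() when done.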
    

    EDIT: A memory map always covers a whole number of pages. If the file size is not a multiple of the page size, the rest of the last page is filled with 0 up to the next multiple of the page size. Programs that use string functions (like your printf()) on memory-mapped files will therefore often work most of the time, but will suddenly crash when they map a file whose size is exactly a multiple of the page size (4096, 8192, 12288, etc.), because no 0 byte follows the data. The often-seen advice to pass mmap() a size bigger than the real file size may appear to work on Linux but is not portable: POSIX does not guarantee that pages lying entirely beyond the end of the file can be accessed, and touching them can raise SIGBUS. The only portable way is to not use string functions on memory maps.
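If string functions are really needed, one option is to copy the mapped bytes into an ordinary heap buffer and add the terminator there. The sketch below assumes a hypothetical helper name, copy_to_cstring, and that len is the real file size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy len bytes from a mapping into a freshly
 * allocated, NUL-terminated buffer. Returns NULL if allocation fails.
 * The caller releases the result with free(). */
char *copy_to_cstring(const char *map, size_t len)
{
    assert(map != NULL || len == 0);

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    if (len > 0)
        memcpy(copy, map, len);
    copy[len] = '\0';   /* terminator that the mapping itself may lack */
    return copy;
}
```

If the goal is only to print the data, fwrite(data, 1, len, stdout) or printf("%.*s", (int)len, data) avoids the copy, since both are bounded by an explicit length. Note that a file containing embedded 0 bytes will still look truncated to string functions.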