I'm trying to map a file from local disk into memory so my program can access the file content. When I call mmap on a file (just under 100 kB in size) and view the memory in the debugger starting at the address returned by mmap, the memory content does not match the file content (both viewed in hexadecimal). This is not a byte-swapping issue. Only the first 2 bytes in memory match the actual file; the rest of the content does not.
When I repeat the same thing on a small file containing a string (e.g. "hello world"), the memory as viewed in the debugger matches the content of the file exactly (again viewed in hex).
I tried using MAP_PRIVATE instead of MAP_SHARED, but got the same result. How can I get this to work with my bigger file?
I'm working on Ubuntu 17.10 with Eclipse 4.7.2 + CDT and debugging with GDB.
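#include <iostream>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string>
#include <unistd.h>
#include <sys/mman.h>

int main()
{
    const char* fileNameAndPath = "/home/Test/testfile.txt";

    int FileDes = open(fileNameAndPath, O_RDWR);
    if (FileDes == -1)
    {
        std::cout << "init: open failed" << std::endl;
        return 1;
    }

    struct stat FileProps;
    if (fstat(FileDes, &FileProps) != 0)
    {
        std::cout << "init: fstat failed" << std::endl;
        close(FileDes);
        return 1;
    }

    // Map the whole file. mmap reports failure with MAP_FAILED, not NULL.
    void* MapAddr = mmap(NULL, FileProps.st_size, (PROT_READ | PROT_WRITE), MAP_SHARED, FileDes, 0);
    if (MapAddr == MAP_FAILED)
    {
        std::cout << "init: mmap failed" << std::endl;
        close(FileDes);
        return 1;
    }

    // The mapped bytes are not guaranteed to be NUL-terminated,
    // so write exactly st_size bytes instead of streaming a C string.
    char* pData = (char*) MapAddr;
    std::cout.write(pData, FileProps.st_size);
    std::cout << std::endl;

    munmap(MapAddr, FileProps.st_size);
    close(FileDes);
    return 0;
}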
13:10:42 **** Build of configuration Debug for project mmapTest ****
make all
Building file: ../src/mmapTest.cpp
Invoking: GCC C++ Compiler
g++ -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"src/mmapTest.d" -MT"src/mmapTest.o" -o "src/mmapTest.o" "../src/mmapTest.cpp"
Finished building: ../src/mmapTest.cpp
Building target: mmapTest
Invoking: GCC C++ Linker
g++ -o "mmapTest" ./src/mmapTest.o
Finished building target: mmapTest
13:10:46 Build Finished (took 4s.438ms)
I found the reason the memory content (as viewed in the GDB debugger) did not match the file content I expected after mmap() ran: the file I mapped was not ANSI-encoded. The debugger was showing the correct data, that is, the bytes of the file as Linux saw them. After I saved the file in ANSI format (in TextPad), the binary content of the file as viewed in Linux was the same as the binary content as viewed in Windows. No code change was needed; the problem was with the file being mapped. And contrary to one of the comments, the GDB debugger IS ABLE to show ALL the mapped file data in the memory viewer at the address returned by mmap(). I've confirmed this for files up to 100 KB.
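For anyone hitting the same symptom, here is a minimal sketch of how the encoding could be checked from the mapped bytes themselves. The helper names hexPrefix and guessBom are my own (hypothetical, not part of any library), and the check assumes the editor wrote a byte-order mark (BOM) at the start of the file, which is not guaranteed for every non-ANSI encoding.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format the first bytes of a buffer as
// space-separated lowercase hex pairs, e.g. "68 65 6c 6c 6f".
std::string hexPrefix(const unsigned char* data, std::size_t len,
                      std::size_t maxBytes = 16)
{
    std::string out;
    const std::size_t n = (len < maxBytes) ? len : maxBytes;
    char buf[4];
    for (std::size_t i = 0; i < n; ++i)
    {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(data[i]));
        if (i != 0)
            out += ' ';
        out += buf;
    }
    return out;
}

// Hypothetical helper: guess the encoding from a leading byte-order mark.
// Assumption: the file starts with a BOM. A file without one is reported
// as "no BOM", which does not prove it is plain ANSI/ASCII. A UTF-32 LE
// BOM (ff fe 00 00) also starts with the UTF-16 LE bytes and is not
// distinguished here.
const char* guessBom(const unsigned char* data, std::size_t len)
{
    if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return "UTF-8 BOM";
    if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return "UTF-16 LE BOM";
    if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return "UTF-16 BE BOM";
    return "no BOM";
}
```

With the program from the question, these could be called right after mmap() succeeds, e.g. hexPrefix(reinterpret_cast<const unsigned char*>(MapAddr), FileProps.st_size), and the result compared against the output of hexdump -C -n 16 (or xxd -l 16) on the same file. If both agree, the mapping is fine and any mismatch comes from how the file was saved or from the tool used to view it.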