
Python writing into a mapped file - strange behaviour


I have a Python script that can successfully write binary data to a file:

iterable_array = [i + 32 for i in range(50)]
file_handle = open("a.bin", "wb")
bytes_to_write = bytearray(iterable_array)
file_handle.write(bytes_to_write)
file_handle.close()

However, I get the following error:

Traceback (most recent call last):
  File "python_100_chars.py", line 20, in <module>
    file_handle = open("a.bin", "wb")
OSError: [Errno 22] Invalid argument: 'a.bin'

when I try to write while executing the following program (source originally from Microsoft docs) that creates a file mapping and reads the data after a keypress:

HANDLE hFile = CreateFileA( "a.bin",
                            GENERIC_READ | GENERIC_WRITE,
                            FILE_SHARE_WRITE | FILE_SHARE_READ,
                            NULL,
                            CREATE_ALWAYS,
                            FILE_ATTRIBUTE_NORMAL,
                            NULL);
HANDLE hMapFile;
LPCTSTR pBuf;
hMapFile = CreateFileMapping(
             hFile,                   // handle to a.bin (a real file, not the paging file)
             NULL,                    // default security
             PAGE_EXECUTE_READWRITE,  // read/write/execute access
             0,                       // maximum object size (high-order DWORD)
             BUF_SIZE,                // maximum object size (low-order DWORD)
             szName);                 // name of mapping object

if (hMapFile == NULL)
{
   _tprintf(TEXT("Could not create file mapping object (%d).\n"),
          GetLastError());
   return 1;
}
pBuf = (LPTSTR) MapViewOfFile(hMapFile,   // handle to map object
                     FILE_MAP_ALL_ACCESS, // read/write permission
                     0,
                     0,
                     BUF_SIZE);
if (pBuf == NULL)
{
   _tprintf(TEXT("Could not map view of file (%d).\n"),
          GetLastError());

    CloseHandle(hMapFile);

   return 1;
}


_getch();   // wait for a keypress; the Python script is run during this pause

printf("string inside file:%s", (char *)((void *)pBuf));
UnmapViewOfFile(pBuf);

CloseHandle(hMapFile);
CloseHandle(hFile);

I've already tested that I can write into the memory-mapped file (and see the results) with basic I/O in the following way:

HANDLE hFile =  CreateFileA(         "a.bin",
                               GENERIC_WRITE,
          FILE_SHARE_WRITE | FILE_SHARE_READ,
                                        NULL,
                               OPEN_EXISTING,
                      FILE_ATTRIBUTE_NORMAL ,
                                        NULL);
char *p = "bonjour";
DWORD bw;
WriteFile(   hFile,
                 p,
                 8,
               &bw,
               NULL);

  • What is the Python script doing that prevents it from writing?

Thank you for any feedback!


Solution

  • I do not have Windows, so I cannot test the behaviour, but I believe this happens because Windows does not follow POSIX semantics here: instead of truncating the existing file in place, open(name, 'wb') opens it with CREATE_ALWAYS, which conflicts with the file being mapped in another process. The ab mode could work too, but, as the Python documentation says:

    'a' for appending (which on some Unix systems, means that all writes append to the end of the file regardless of the current seek position).

    Unfortunately, the open function uses the C fopen mode strings, so there is no way to ask for "open the file for random-access writing only, without truncating it". The closest available mode is "open it for both reading and writing without truncating the file", which is r+b.
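
A minimal sketch of the r+b approach, assuming the file already exists (here it would have been created by the C program). The helper name overwrite_in_place and its offset parameter are hypothetical, introduced only for illustration, and I have not run this on Windows against a live mapping:

```python
def overwrite_in_place(path, data, offset=0):
    """Write `data` at `offset` in an existing file without truncating it."""
    # "r+b" opens for reading and writing and never truncates the file;
    # unlike "wb", it raises FileNotFoundError if the file does not exist.
    with open(path, "r+b") as file_handle:
        file_handle.seek(offset)
        file_handle.write(data)
```

With the script from the question, this would be called as overwrite_in_place("a.bin", bytearray(iterable_array)). Bytes beyond the written range keep their previous contents, since nothing is truncated.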


    As user Eryk Sun pointed out, the problem is that Windows does not support file truncation at all if there are any existing memory mappings:

    If CreateFileMapping is called to create a file mapping object for hFile, UnmapViewOfFile must be called first to unmap all views and call CloseHandle to close the file mapping object before you can call SetEndOfFile.

    Likewise, a mapping cannot cover more space than the file occupies: if the mapping is longer than the file size, the file is extended instead...


    On POSIXly correct platforms, a program like

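    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        struct stat s;

        /* Open the existing file and map its whole current length, shared. */
        int fd = open("a.bin", O_RDWR);
        if (fd < 0 || fstat(fd, &s) < 0) {
            perror("a.bin");
            return EXIT_FAILURE;
        }
        unsigned char *d = mmap(0, s.st_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
        if (d == MAP_FAILED) {
            perror("mmap");
            return EXIT_FAILURE;
        }

        /* Wait for a keypress so the Python script can run in the meantime. */
        getchar();

        /* Dump the mapped bytes as they are now. */
        for (off_t i = 0; i < s.st_size; i++) {
            putchar(d[i]);
        }

        putchar('\n');
        fflush(stdout);
        getchar();

        munmap(d, s.st_size);
        close(fd);
        return 0;
    }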
    

    will not interfere with running the said Python program. However, if this C program accesses the mapping within the window during which the file has been truncated to 0 bytes, a SIGBUS signal will be raised in the C process.
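
The same experiment can be sketched in a single Python process with the standard mmap module: map the file, overwrite its start through a separate r+b handle, then read the bytes back through the mapping. This assumes the platform keeps a shared mapping coherent with ordinary file writes, as Linux does. I have not verified it on Windows. The function name read_through_mapping_after_write is hypothetical, and the file must be non-empty and at least as long as the payload.

```python
import mmap


def read_through_mapping_after_write(path, payload):
    """Map `path`, overwrite its start via a second handle, return what the mapping sees."""
    with open(path, "r+b") as mapped_file:
        # Length 0 maps the whole file; the default access is a shared read/write view.
        with mmap.mmap(mapped_file.fileno(), 0) as view:
            # A second handle opened with "r+b" writes in place without truncating,
            # so the existing mapping stays valid.
            with open(path, "r+b") as writer:
                writer.write(payload)
                writer.flush()
            return view[:len(payload)]
```

Opening the second handle with "wb" instead would truncate the file underneath the mapping, which is exactly the situation described above.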