Tags: c, docker, macos, windows-subsystem-for-linux, stdio

C program in Docker: fwrite(3) and write(2) fail to modify files on Windows but not on macOS


I am writing a guest OS on top of Linux (Ubuntu distribution) within a Docker container. The filesystem is implemented as a single file resting inside the host OS, so anytime a file is changed in the guest OS filesystem, the file on the host OS must be opened, the correct block(s) must be overwritten, and the file must be closed.

My partner and I have developed the following recursive helper function to take in a block number and offset to abstract away all details at the block-level for higher level functions:

/**
 * Recursive procedure to write n bytes from buf to the
 * block specified by block_num. Also updates FAT to
 * reflect changes.
 * 
 * @param block_num identifier for block to begin writing
 * @param buf buffer to write from
 * @param n number of bytes to write
 * @param offset number of bytes to start writing from as
 *               measured from start of file
 * 
 * @returns number of bytes written
*/
int write_bytes(int block_num, const char *buf, int n, int offset) {
    BlockTuple red_tup = reduce_block_offset(block_num, offset);
    block_num = red_tup.block;
    offset = red_tup.offset;
    FILE *fp = fopen(fat->fname, "r+");

    int bytes_to_write = min(n, fat->block_size - offset);
    int write_n = max(bytes_to_write, 0);
    fseek(fp, get_block_start(block_num) + offset, SEEK_SET);
    fwrite(buf, 1, write_n, fp); // This line is returning 48 bytes written
    fclose(fp);

    // Check if there are bytes remaining
    int bytes_left = n - write_n;
    if (bytes_left > 0) {
        // Recursively write on next block
        int next_block = get_free_block();
        set_fat_entry(block_num, next_block); // point block to next block
        set_fat_entry(next_block, 0xFFFF);
        return write_bytes(next_block, buf + write_n, bytes_left, max(0, offset - fat->block_size)) + write_n;
    } else {
        set_fat_entry(block_num, 0xFFFF); // mark file as terminated
        return write_n;
    }
}

The issue is that fwrite(3) is reporting 48 bytes written (when n is passed as 48) but hexdumping the file on the host OS reveals no bytes have been changed:

00000000  00 01 ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00008000

This is particularly wacky because when my partner runs the code on the exact same commit (with no uncommitted changes), her write goes through and the file on the host OS hexdumps to:

00000000  00 01 ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100  66 31 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |f1..............|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000120  00 01 00 00 02 00 01 06  e7 36 75 63 00 00 00 00  |.........6uc....|
00000130  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200  e0 53 f8 88 c0 0d 37 ca  84 1f 19 b0 6c a8 68 7b  |.S....7.....l.h{|
00000210  57 83 cf 13 f0 42 21 d3  21 e1 da de d4 8a f1 e6  |W....B!.!.......|
00000220  f0 12 98 fb 1c 30 4c 04  b3 16 1d 96 17 ba d7 5a  |.....0L........Z|
00000230  7e f3 8a f5 6a 42 6b ef  58 f6 bc 01 db 0c 02 53  |~...jBk.X......S|
00000240  e5 10 7e f3 4a d5 3f ac  8e 38 82 c3 95 f8 11 8e  |..~.J.?..8......|
00000250  a6 82 eb 3b 24 56 9a 75  44 36 8b 25 60 83 4c 04  |...;$V.uD6.%`.L.|
00000260  07 9e 14 99 9c 9f 87 3c  8a d4 c3 e8 17 60 81 0e  |.......<.....`..|
00000270  bc eb 1d 35 68 fc d5 be  4f 1c 9d 5e 72 57 65 01  |...5h...O..^rWe.|
00000280  b7 43 54 26 d6 6d ba 51  bf 12 8c a1 03 d5 66 b3  |.CT&.m.Q......f.|
00000290  90 0d 60 b8 95 8d 15 bd  53 9a 70 77 4f 7a 04 1e  |..`.....S.pwOz..|
000002a0  9e b2 4c 9a 79 dd de 48  cd fe 1e dc 57 7d d1 7f  |..L.y..H....W}..|
000002b0  3f f5 77 96 fa e7 d7 33  33 48 ce 0a 4d 61 ab 96  |?.w....33H..Ma..|
000002c0  5f c4 88 bf c6 3a 09 37  76 c4 b8 db bc 6a 7d c0  |_....:.7v....j}.|
000002d0  c4 89 68 e7 b4 70 f8 a6  a8 00 9d c4 63 da fb 66  |..h..p......c..f|
000002e0  be d2 cd 68 1c d2 ff bf  00 e9 37 ab 6b 1a 3c f2  |...h......7.k.<.|
000002f0  7b c1 a2 c4 46 ae db 93  b4 4f 64 79 14 2a 1a d4  |{...F....Ody.*..|
00000300  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00008000

The 48 bytes that don't get written are the ones written to the directory block, running from address 00000100 to 0000012F. The bytes below that represent the actual file being written; the code segfaults on my end before reaching that write. It's worth noting that my container can still format the filesystem file, so not all writes are broken. This snippet is just the first write that did not work.

We are both running the code in an identical Docker container. The only difference I can think of is that my computer runs Windows and hers is a Mac. What could possibly be the issue?

My first guess was that some conflict with the host OS blocked my write, but assigning and printing the return value of fwrite(3) showed that 48 bytes were indeed written on both machines.

I was also expecting that my buffer was simply all 0s (it is initially allocated using calloc(3)), but printing out the first 48 bytes of the buffer proved that theory false.

Finally, I considered that this was an issue with the higher-level interface in <stdio.h> rather than the lower-level one in <unistd.h>. I replaced fopen(3), fwrite(3), fseek(3), and fclose(3) with their lower-level equivalents (open(2), write(2), lseek(2), close(2)), and it still reported 48 bytes written with no actual change to the file.
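For reference, here is a minimal sketch of what such a low-level replacement can look like when every return value is checked. The helper name write_at and its signature are hypothetical and not taken from the code above; it uses pwrite(2) instead of a separate lseek(2) and write(2), and adds an fsync(2) before closing. As described above, this kind of change made no difference to the symptom, so treat it as a way to rule out stdio buffering rather than as a fix.

```c
#define _XOPEN_SOURCE 700 /* expose pwrite(2)/fsync(2) under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write n bytes from buf at absolute file offset pos.
 * Mirrors the fopen("r+") / fseek / fwrite / fclose sequence using
 * open / pwrite / close.
 * Returns the number of bytes written, or -1 on error with errno set. */
ssize_t write_at(const char *path, off_t pos, const void *buf, size_t n) {
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDWR); /* like "r+": the file must already exist */
    if (fd < 0)
        return -1;

    const char *p = buf;
    size_t done = 0;
    while (done < n) {
        ssize_t w = pwrite(fd, p + done, n - done, pos + (off_t)done);
        if (w < 0) {
            if (errno == EINTR)
                continue; /* interrupted before any byte was written: retry */
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        if (w == 0)
            break; /* no progress; report the short count to the caller */
        done += (size_t)w;
    }

    /* Ask the kernel to flush this file's data before the descriptor is closed. */
    if (fsync(fd) != 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    if (close(fd) != 0)
        return -1;
    return (ssize_t)done;
}
```

A call such as write_at(fat->fname, get_block_start(block_num) + offset, buf, write_n) would stand in for the fopen/fseek/fwrite/fclose lines in write_bytes, with a return of -1 or a short count treated as a failure.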

EDIT: The guest OS filesystem can be formatted with respect to user parameters. All testing above was performed with a block size of 256 bytes and 128 blocks total. I've attempted the exact same write sequence again with a block size of 1024 bytes and 16384 blocks total, and there was no error. It's still unclear why the code works on my partner's machine for both format configs and not mine, but this may narrow it down.

Running strace reveals the following excerpt around the write:

openat(AT_FDCWD, "minfs", O_RDWR)       = 4
newfstatat(4, "", {st_mode=S_IFREG|0777, st_size=32768, ...}, AT_EMPTY_PATH) = 0
lseek(4, 0, SEEK_SET)                   = 0
read(4, "\0\1\377\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 256) = 256
write(4, "f1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 48) = 48
close(4) 

Again, it appears the bytes get written, but running hd after the program finishes shows the same output as above. My thought was that perhaps the bytes written in the excerpt are overwritten later on, but the only write after the excerpt above in the strace is:

lseek(4, 0, SEEK_SET)                   = 0
read(4, "\0\1\377\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 320) = 320
write(4, "\0", 1)                       = 1
close(4)

which should be at address 320, squarely after the write at address 256 above.


Solution

  • It turns out the mismatch came from relying on when changes made through mmap(2) are synchronized with the underlying file, which is not guaranteed without an explicit msync(2). There was a section of code where a region of memory mapped via mmap(2) was modified and then immediately followed by reads/writes to the file on the host OS backing that mapped region. It seems the Mac wrote the changes through before the following section ran, while the Windows machine did not synchronize until afterwards, so the later reads and writes worked from stale file contents.

    The problem was fixed by calling msync(2) with the MS_SYNC flag immediately after modifying the mapped region, which forces the write-through behavior.

    Documentation: see the man pages for mmap(2) and msync(2).
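The fix described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names mapped_write_sync and msync_demo, the file path, and the offset used are hypothetical and not taken from the original project. The idea is to modify the MAP_SHARED region, call msync(2) with MS_SYNC on the page-aligned range that was touched, and only then go through read(2)/write(2)-style calls on the same file.

```c
#define _XOPEN_SOURCE 700 /* expose POSIX declarations under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy len bytes from src into a MAP_SHARED mapping at
 * byte offset off, then block until the touched pages are written back to
 * the underlying file. `map` must be the address returned by mmap(2), so it
 * is page-aligned. Returns 0 on success, -1 on failure with errno set. */
int mapped_write_sync(unsigned char *map, size_t off, const void *src, size_t len) {
    assert(map != NULL && src != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    memcpy(map + off, src, len);

    /* msync(2) needs a page-aligned start address, so round the start of the
     * flushed range down to the page that contains `off`. */
    size_t start = off - (off % (size_t)page);
    return msync(map + start, (off - start) + len, MS_SYNC);
}

/* Hypothetical round trip: create a 32768-byte image, change two bytes
 * through the mapping, sync, then read them back with pread(2).
 * Returns 0 if the file-descriptor view matches the mapped change. */
int msync_demo(const char *path) {
    const size_t size = 32768; /* same size as the image in the question */
    const unsigned char entry[2] = {0xFF, 0xFF};
    unsigned char seen[2] = {0, 0};
    int rc = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }

    unsigned char *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map != MAP_FAILED) {
        if (mapped_write_sync(map, 2, entry, sizeof entry) == 0 &&
            pread(fd, seen, sizeof seen, 2) == (ssize_t)sizeof seen &&
            memcmp(seen, entry, sizeof entry) == 0)
            rc = 0;
        munmap(map, size);
    }

    close(fd);
    unlink(path);
    return rc;
}
```

On many local Linux filesystems the pread(2) here would probably see the change even without the msync(2) call, which is consistent with the code appearing to work on one machine. The explicit MS_SYNC is there so that the ordering no longer depends on how the host and its file sharing layer behave.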