I am writing a guest OS on top of Linux (Ubuntu distribution) within a Docker container. The filesystem is implemented as a single file resting inside the host OS, so anytime a file is changed in the guest OS filesystem, the file on the host OS must be opened, the correct block(s) must be overwritten, and the file must be closed.
My partner and I have developed the following recursive helper function to take in a block number and offset to abstract away all details at the block-level for higher level functions:
/**
* Recursive procedure to write n bytes from buf to the
* block specified by block_num. Also updates FAT to
* reflect changes.
*
* @param block_num identifier for block to begin writing
* @param buf buffer to write from
* @param n number of bytes to write
* @param offset number of bytes to start writing from as
* measured from start of file
*
* @returns number of bytes written
*/
int write_bytes(int block_num, const char *buf, int n, int offset) {
    BlockTuple red_tup = reduce_block_offset(block_num, offset);
    block_num = red_tup.block;
    offset = red_tup.offset;
    FILE *fp = fopen(fat->fname, "r+");
    int bytes_to_write = min(n, fat->block_size - offset);
    int write_n = max(bytes_to_write, 0);
    fseek(fp, get_block_start(block_num) + offset, SEEK_SET);
    fwrite(buf, 1, write_n, fp); // This line is returning 48 bytes written
    fclose(fp);
    // Check if there are bytes remaining
    int bytes_left = n - write_n;
    if (bytes_left > 0) {
        // Recursively write on next block
        int next_block = get_free_block();
        set_fat_entry(block_num, next_block); // point block to next block
        set_fat_entry(next_block, 0xFFFF);
        return write_bytes(next_block, buf + write_n, bytes_left, max(0, offset - fat->block_size)) + write_n;
    } else {
        set_fat_entry(block_num, 0xFFFF); // mark file as terminated
        return write_n;
    }
}
The issue is that fwrite(3) is reporting 48 bytes written (when n is passed as 48) but hexdumping the file on the host OS reveals no bytes have been changed:
00000000 00 01 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 |................|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00008000
This is particularly wacky because when my partner runs the code on the exact same commit (with no uncommitted changes), her write goes through and the file on the host OS hexdumps to:
00000000 00 01 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 |................|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000100 66 31 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |f1..............|
00000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000120 00 01 00 00 02 00 01 06 e7 36 75 63 00 00 00 00 |.........6uc....|
00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000200 e0 53 f8 88 c0 0d 37 ca 84 1f 19 b0 6c a8 68 7b |.S....7.....l.h{|
00000210 57 83 cf 13 f0 42 21 d3 21 e1 da de d4 8a f1 e6 |W....B!.!.......|
00000220 f0 12 98 fb 1c 30 4c 04 b3 16 1d 96 17 ba d7 5a |.....0L........Z|
00000230 7e f3 8a f5 6a 42 6b ef 58 f6 bc 01 db 0c 02 53 |~...jBk.X......S|
00000240 e5 10 7e f3 4a d5 3f ac 8e 38 82 c3 95 f8 11 8e |..~.J.?..8......|
00000250 a6 82 eb 3b 24 56 9a 75 44 36 8b 25 60 83 4c 04 |...;$V.uD6.%`.L.|
00000260 07 9e 14 99 9c 9f 87 3c 8a d4 c3 e8 17 60 81 0e |.......<.....`..|
00000270 bc eb 1d 35 68 fc d5 be 4f 1c 9d 5e 72 57 65 01 |...5h...O..^rWe.|
00000280 b7 43 54 26 d6 6d ba 51 bf 12 8c a1 03 d5 66 b3 |.CT&.m.Q......f.|
00000290 90 0d 60 b8 95 8d 15 bd 53 9a 70 77 4f 7a 04 1e |..`.....S.pwOz..|
000002a0 9e b2 4c 9a 79 dd de 48 cd fe 1e dc 57 7d d1 7f |..L.y..H....W}..|
000002b0 3f f5 77 96 fa e7 d7 33 33 48 ce 0a 4d 61 ab 96 |?.w....33H..Ma..|
000002c0 5f c4 88 bf c6 3a 09 37 76 c4 b8 db bc 6a 7d c0 |_....:.7v....j}.|
000002d0 c4 89 68 e7 b4 70 f8 a6 a8 00 9d c4 63 da fb 66 |..h..p......c..f|
000002e0 be d2 cd 68 1c d2 ff bf 00 e9 37 ab 6b 1a 3c f2 |...h......7.k.<.|
000002f0 7b c1 a2 c4 46 ae db 93 b4 4f 64 79 14 2a 1a d4 |{...F....Ody.*..|
00000300 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00008000
The 48 bytes I'm referring to that don't get written are the bytes written to the directory block, running from address 00000100 to 0000012F (the bytes below that represent the actual file being written; the code segfaults on my end before reaching that write). It's worth noting that my container can still format the filesystem file, so not all writes are broken. This snippet just represents the first write that did not work.
We are both running the code in an identical Docker container. The only difference I can think of is that my computer runs Windows and hers is a Mac. What could possibly be the issue?
The first thing I suspected was a conflict with the host OS that blocked my write, but assigning and printing the return value of fwrite(3) showed that 48 bytes were indeed written on both machines.
I also suspected that my buffer was simply all zeros (it is initially allocated using calloc(3)), but printing the first 48 bytes of the buffer disproved that theory.
I finally considered that this was some issue with the higher-level interface in <stdio.h> rather than the lower-level one in <unistd.h>. I replaced fopen(3), fwrite(3), fseek(3), and fclose(3) each with their lower-level equivalents (write(2), etc.), and it still reported 48 bytes written with no actual change to the file.
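For readers who want to try the same experiment, here is a minimal sketch of what the lower-level version of a single positioned write might look like. The helper name `write_at` and its signature are assumptions made for illustration, not the code from the project; it checks every return value, retries short writes, and calls fsync(2) before closing.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original project): write n bytes of buf
 * at byte position `offset` in the existing file `path`.
 * Returns the number of bytes written, or -1 on any failure. */
ssize_t write_at(const char *path, off_t offset, const char *buf, size_t n) {
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    size_t done = 0;
    while (done < n) {
        ssize_t w = write(fd, buf + done, n - done);
        if (w < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            close(fd);
            return -1;
        }
        if (w == 0)
            break;                  /* no progress: stop instead of spinning */
        done += (size_t)w;
    }

    /* Ask the kernel to flush this file's data before the descriptor closes. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    if (close(fd) != 0)
        return -1;
    return (ssize_t)done;
}
```

A call such as `write_at("minfs", 256, entry, 48)` would correspond to the directory write discussed here, where `entry` stands for the 48-byte buffer. A successful return only says that the kernel accepted the bytes on this descriptor; it says nothing about other views of the same file, such as a memory mapping.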
EDIT: The guest OS filesystem can be formatted with respect to user parameters. All testing above was performed with a block size of 256 bytes and 128 blocks total. I've attempted the exact same write sequence again with a block size of 1024 bytes and 16384 blocks total, and there was no error. It's still unclear why the code works on my partner's machine for both format configs and not mine, but this may narrow it down.
Running strace reveals the following excerpt around the write:
openat(AT_FDCWD, "minfs", O_RDWR) = 4
newfstatat(4, "", {st_mode=S_IFREG|0777, st_size=32768, ...}, AT_EMPTY_PATH) = 0
lseek(4, 0, SEEK_SET) = 0
read(4, "\0\1\377\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 256) = 256
write(4, "f1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 48) = 48
close(4)
It again appears the bytes get written, but running hd after the program finishes shows the same output as above. My thought was that perhaps the bytes written in the excerpt are overwritten later on, but the only write after the excerpt above in the strace output is:
lseek(4, 0, SEEK_SET) = 0
read(4, "\0\1\377\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 320) = 320
write(4, "\0", 1) = 1
close(4)
which should be at address 320, squarely after the write at address 256 above.
It turns out the mismatch was due to undefined behavior concerning when changes made through mmap(2) are synchronized with the file. There was a section of code where a region of memory mapped via mmap(2) was modified and then immediately followed by reads and writes to the file on the host OS that backs the mapped region. It seems the Mac wrote the changes through before the following section ran, while the Windows machine did not synchronize until after the fact, resulting in the mismatch.
The problem was fixed by calling msync(2) with the MS_SYNC flag immediately after modifying the region mapped by mmap(2), which forces the write-through behavior.
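The pattern can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the helper name `update_mapped_region` is made up, it maps only the first `len` bytes of the file, and it assumes the file is already at least `len` bytes long.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original project): copy len bytes from
 * src into the start of the file at `path` through a shared mapping, then
 * flush the mapping so that later read(2)/write(2) calls on the same file
 * observe the change. The file must already be at least len bytes long.
 * Returns 0 on success, -1 on failure. */
int update_mapped_region(const char *path, const void *src, size_t len) {
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* MAP_SHARED so that stores are carried through to the underlying file. */
    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, src, len);          /* modify the mapped region */

    /* MS_SYNC blocks until the modified pages have been written back. */
    int rc = msync(map, len, MS_SYNC);

    if (munmap(map, len) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Any fopen(3)/fwrite(3) or open(2)/write(2) access to the same file would then come after this function returns, so the mapped changes are already in the file by the time the ordinary I/O path touches it.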