Tags: linux, linux-kernel, filesystems, ext4

Is overwriting a small file atomic on ext4?


Assume we have a file of FILE_SIZE bytes, and:

  • FILE_SIZE <= min(page_size, physical_block_size);
  • the file size never changes (i.e. truncate() and appending write() calls are never performed);
  • the file is modified only by completely overwriting its contents using:

    pwrite(fd, buf, FILE_SIZE, 0);
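
A minimal sketch of the overwrite step described above, assuming the caller already holds an open descriptor. The helper name overwrite_whole is hypothetical. The point is only that pwrite() is allowed to write fewer bytes than requested, so a "complete overwrite" has to check the return value; the sketch treats a short write as a failure instead of completing it with a second call.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first `size` bytes of fd with one pwrite().
 * Returns 0 only if a single call wrote the whole buffer, -1 otherwise. */
static int overwrite_whole(int fd, const void *buf, size_t size)
{
    ssize_t n;

    do {
        n = pwrite(fd, buf, size, 0);
    } while (n < 0 && errno == EINTR);   /* interrupted before writing anything */

    if (n < 0)
        return -1;                       /* errno was set by pwrite() */
    if ((size_t)n != size) {
        errno = EIO;                     /* short write: file may hold mixed data */
        return -1;
    }
    return 0;
}
```

Usage would look like `if (overwrite_whole(fd, buf, FILE_SIZE) != 0) perror("overwrite");`.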
    

Is it guaranteed on ext4 that:

  1. Such writes are atomic with respect to concurrent reads?
  2. Such writes are transactional with respect to a system crash?

    (i.e., after a crash the file's contents come entirely from some previous write, and we never see a partial write or an empty file)

Is the second true:

  • with data=ordered?
  • with data=journal or alternatively with journaling enabled for a single file?

    (using ioctl(fd, EXT4_IOC_SETFLAGS, EXT4_JOURNAL_DATA_FL))

  • when physical_block_size < FILE_SIZE <= page_size?
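
For reference, the per-file journaling option from the list above can be sketched in userspace roughly as follows. This is a sketch under assumptions: it uses the generic FS_IOC_GETFLAGS / FS_IOC_SETFLAGS ioctls and FS_JOURNAL_DATA_FL from <linux/fs.h> (as far as I know, the ext4-specific names map to these), it passes a pointer to the flag word rather than the flag itself, and enable_data_journaling is a hypothetical helper name. Setting this flag normally requires root or CAP_SYS_RESOURCE, and the call fails on filesystems that do not support it.

```c
#include <linux/fs.h>    /* FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_JOURNAL_DATA_FL */
#include <sys/ioctl.h>

/* Hypothetical helper: request data journaling for one open file.
 * Returns 0 on success, -1 with errno set on failure. */
static int enable_data_journaling(int fd)
{
    int flags = 0;

    /* Read the current inode flags so the other bits are preserved. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
        return -1;

    flags |= FS_JOURNAL_DATA_FL;   /* the 'j' attribute shown by lsattr */

    /* Write the updated flag word back to the inode. */
    return ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0 ? -1 : 0;
}
```

The read-modify-write pattern matters here: writing only FS_JOURNAL_DATA_FL would clear any other flags already set on the file.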


I've found a related question which links to a discussion from 2011. However:

  • I didn't find an explicit answer for my question 2.
  • If the above is true, is it documented anywhere?

Solution

  • In my experiment, such writes were not atomic with respect to concurrent reads.

    The experiment used two processes, one writer and one reader. The writer overwrites the file in a loop, and the reader reads it back in a loop.

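    Writer process:

    /* Two 18-byte patterns; each literal fills its row exactly, so no NUL is stored. */
    char buf[][18] = {
        "xxxxxxxxxxxxxxxxxx",
        "yyyyyyyyyyyyyyyyyy"
    };
    int i = 0;
    while (1) {
        pwrite(fd, buf[i], 18, 0);   /* overwrite the whole file at offset 0 */
        i = (i + 1) % 2;             /* alternate between the two patterns */
    }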
    

    Reader process:

    char readbuf[18];
    while (1) {
        pread(fd, readbuf, 18, 0);
        /* check whether readbuf equals buf[0] or buf[1]; anything else is a torn read */
    }
    

    After running both processes for a while, I saw that readbuf sometimes held a mix of the two patterns: either xxxxxxxxxxxxxxxxyy or yyyyyyyyyyyyyyyyxx.

    This shows that the writes are not atomic. In my case, 16-byte writes were always atomic.

    The answer I got was that POSIX doesn't mandate atomicity for writes/reads except for pipes. The 16-byte atomicity that I saw is kernel-specific and may change in the future.

    The details are in the original post: write(2)/read(2) atomicity between processes in linux
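
The reader-side check can be sketched as below. It assumes the writer alternates between 18 bytes of 'x' and 18 bytes of 'y'; MSG_LEN, tear_offset and classify are hypothetical names introduced for illustration, not part of the original experiment.

```c
#define MSG_LEN 18

/* Returns the offset at which the buffer first differs from its first byte,
 * or -1 if all MSG_LEN bytes are identical. */
static int tear_offset(const char *readbuf)
{
    for (int i = 1; i < MSG_LEN; i++)
        if (readbuf[i] != readbuf[0])
            return i;
    return -1;
}

/* Returns 0 for the all-'x' pattern, 1 for the all-'y' pattern,
 * and -1 for anything else (a torn or unexpected read). */
static int classify(const char *readbuf)
{
    if (tear_offset(readbuf) != -1)
        return -1;
    if (readbuf[0] == 'x')
        return 0;
    if (readbuf[0] == 'y')
        return 1;
    return -1;
}
```

In the reader loop, calling classify(readbuf) after each pread() and logging tear_offset(readbuf) whenever it returns -1 would show where the two writes interleaved. For a read of sixteen 'x' followed by two 'y', the reported offset is 16.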