I was wondering whether the Path.write_text(data) function from pathlib is atomic or not.
If not, are there scenarios where we could end up with a file created in the filesystem but not containing the intended content?
To be more specific, as the comment from @ShadowRanger suggested, what I care about is whether the file always contains either the original data or the new data, but never something in between. That is actually a weaker guarantee than full atomicity.
No, it does not do any tricks with opening a temp file in the same directory, populating it, and finishing with an atomic rename to replace the original file. The current implementation is guaranteed to be at least two distinct operations:
1. Opening the file in write mode (which truncates the file if it already exists)
2. Writing the provided data to the file (which may take more than one system call if the data is large)
If nothing else, your code could die after step 1 and before step 2 (a badly timed Ctrl-C or power loss), and the original data would be gone, and no new data would be written.
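For the "old data or new data, never something in between" guarantee, the temp-file-and-rename trick described above can be sketched roughly as follows. This is a minimal sketch, not something pathlib provides: atomic_write_text is a hypothetical helper name, and it assumes the temp file and the target live on the same filesystem, where os.replace is documented to be atomic on POSIX.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path, data, encoding="utf-8"):
    """Hypothetical helper: replace `path` with `data` via temp file + rename."""
    path = Path(path)
    # Create the temp file next to the target so the rename never crosses filesystems.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the data to disk first
        # Swap the fully written temp file into place in one step.
        os.replace(tmp_name, path)
    except BaseException:
        # On any failure (including Ctrl-C), remove the temp file and leave
        # the original file untouched.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Note that mkstemp creates the file with restrictive permissions, so the replaced file may not keep the original file's mode; copy the mode over before the rename if that matters to you.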
The question is kinda nonsensical on its face. It doesn't really matter if it's atomic; even if it was atomic, a nanosecond after the write occurs, some other process could open the file, truncate it, rewrite it, move it, etc. Heck, in between write_text opening the file and when it writes the data, some other process could swoop in and move/rename the newly opened file or delete it; the open handle write_text holds would still work when it writes a nanosecond later, but the data would never be seen in a file at the provided path (and might disappear the instant write_text closes it, if some other process swooped in and deleted it).
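To make the rename race concrete, here is a small single-process simulation, assuming a POSIX system (on Windows, renaming a file that is still open normally fails). write_after_rename is a hypothetical name, and the os.rename call stands in for what another process might do between the open and the write.

```python
import os
import tempfile
from pathlib import Path


def write_after_rename(directory):
    """Hypothetical demo: the file is moved between the open and the write."""
    directory = Path(directory)
    target = directory / "target.txt"
    moved = directory / "moved.txt"
    with target.open("w") as handle:   # step 1: open (creates or truncates)
        os.rename(target, moved)       # stand-in for another process moving the file
        handle.write("new data")       # step 2: the open handle still accepts the write
    # Report whether anything exists at the original path, and where the data went.
    return target.exists(), moved.read_text()


with tempfile.TemporaryDirectory() as tmp_dir:
    print(write_after_rename(tmp_dir))
```

On a POSIX system the write succeeds, but nothing exists at the original path afterwards; the data ends up under the new name.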
Beyond that, it can't be atomic even while writing, in any portable sense. Two processes could have the same file open at once, and their writes can interleave (there are locks around the standard handles within a process to prevent this, but no such locks exist to coordinate with an arbitrary other process). Concurrent file I/O is hard; avoid it if at all possible.