What is the OS-level handle of tempfile.mkstemp good for?

I use tempfile.mkstemp when I need to create files in a directory which might stay, but I don't care about the filename. It should only be something that doesn't exist so far and have a prefix- and a suffix.

One part about the documentation that I ignored so far is

mkstemp() returns a tuple containing an OS-level handle to an open file (as would be returned by os.open()) and the absolute pathname of that file, in that order.

What is the OS-level handle and how should one use it?

Background

I always used it like this:

from tempfile import mstemp

_, path = mkstemp(prefix=prefix, suffix=suffix, dir=dir)

with open(path, "w") as f:
    f.write(data)

# do something

os.remove(path)

It worked fine so far. However, today I wrote a small script which generates huge files and deletes them. The script aborted the execution with the message

OSError: [Errno 28] No space left on device

When I checked, there were 80 GB free.

My suspicion is that os.remove only "marked" the files for deletion, but the files were not properly removed. And the next suspicion was that I might need to close the OS-level handle before the OS can actually free that disk space.

Solution

Your suspicion is correct. The os.remove only removes the directory entry that contains the name of the file. However, the file data remains intact and continues to consume space on the disk until the last open descriptor on the file is closed. During that time normal operations on the file through existing descriptors continue to work, which means you could still use the _ descriptor to seek in, read from, or write to the file after os.remove has returned.

In fact it's common practice to immediately os.remove the file before moving on to using the descriptor to operate on the file contents. This prevents the file from being opened by any other process, and also means that the file won't be left hanging around if this program dies unexpectedly before reaching a later os.remove.

Of course that only works if you're willing and able to use the low-level descriptor for all of your operations on the file, or if you use the os.fdopen method to construct a file object on top of the descriptor and use that new object for all operations. Obviously you only want to do one of those things; mixing descriptor access and file-object access to the same underlying file can produce unexpected results.

os.fdopen(_) should execute faster than open(path) but it doesn't have the context manager integration that open has, so it's not directly usable in a with construct. I think you can use contextlib.closing to get around that.