
Set the mtime of a file with full microsecond precision in Python


Let's say I create a test file and check its mtime:

$ touch testfile.txt 
$ stat testfile.txt
  File: `testfile.txt'
  Size: 0           Blocks: 0          IO Block: 4096   regular empty file
Device: fc01h/64513d    Inode: 3413533     Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/ me)   Gid: ( 1000/ me)
Access: 2014-09-17 18:38:34.248965866 -0400
Modify: 2014-09-17 18:38:34.248965866 -0400
Change: 2014-09-17 18:38:34.248965866 -0400
 Birth: -

$ date -d '2014-09-17 18:38:34.248965866 -0400' +%s
1410993514

The mtime above is listed with microsecond precision (I realize the system clock resolution makes the finest digits of this kind of useless). The utimes(2) system call allows me to pass in the microseconds. However, the os.utime() function seems to combine seconds and microseconds into a single number.

I can pass a float like this:

>>> os.utime('testfile.txt', (1410993514.248965866, 1410993514.248965866))

but now

$ stat testfile.txt 
  File: `testfile.txt'
  Size: 0           Blocks: 0          IO Block: 4096   regular empty file
Device: fc01h/64513d    Inode: 3413533     Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/ me)   Gid: ( 1000/ me)
Access: 2014-09-17 18:38:34.248965000 -0400
Modify: 2014-09-17 18:38:34.248965000 -0400
Change: 2014-09-17 18:46:07.544974140 -0400
 Birth: -

Presumably the precision is lost because the value was converted to a float and Python knew better than to trust the last few decimal places.

Is there any way to set the full microseconds field via Python?


Solution

  • You are already setting the full microseconds. Micro means millionth; .248965 is 248965 microseconds. .248965866 is 248965866 nanoseconds.

    Of course it's also 248965.866 microseconds, but the portable APIs that Python uses to set times on every platform but Windows only accept integral microseconds, not fractional. (And, in fact, POSIX doesn't require a system to remember anything smaller than microseconds.)
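To see why the digits past the sixth decimal place never reach the file, here is a rough model of splitting a float timestamp into the integral seconds and microseconds that a struct timeval carries. It is a simplified sketch, not CPython's actual conversion code, and the helper name split_timeval is made up for illustration.

```python
import math


def split_timeval(timestamp):
    """Split a float timestamp into (seconds, microseconds), both ints.

    Hypothetical helper: a simplified model of filling a struct timeval,
    not the code CPython actually runs.
    """
    frac, whole = math.modf(timestamp)
    seconds = int(whole)
    # Truncate toward zero: anything finer than a microsecond is dropped.
    microseconds = int(frac * 1_000_000)
    return seconds, microseconds


print(split_timeval(1410993514.248965866))  # (1410993514, 248965)
```

The result matches the .248965000 that stat reports after the float-based os.utime call in the question.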

    As of Python 3.3, os.utime accepts an ns keyword argument on systems that support a way to set nanoseconds. [1][2] You pass ns instead of times: a tuple of two integers, each one the full timestamp expressed in nanoseconds since the epoch (passing both times and ns is an error). Like this:

    >>> os.utime('testfile.txt', ns=(1410993514248965866, 1410993514248965866))
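A minimal round-trip sketch, assuming Python 3.3+ and a filesystem that stores nanosecond timestamps (most modern Linux filesystems do, but that is an assumption about your setup). The helper names set_mtime_ns and read_mtime_ns are made up for illustration; os.utime with ns= and os.stat's st_mtime_ns are the real standard-library pieces.

```python
import os
import tempfile

NS_PER_SEC = 10**9


def set_mtime_ns(path, seconds, nanoseconds):
    """Set atime and mtime of path from integer seconds plus nanoseconds.

    Hypothetical helper; returns the combined nanosecond timestamp it used.
    """
    total_ns = seconds * NS_PER_SEC + nanoseconds
    # ns= takes (atime_ns, mtime_ns), each in nanoseconds since the epoch.
    os.utime(path, ns=(total_ns, total_ns))
    return total_ns


def read_mtime_ns(path):
    """Read the mtime back as an integer number of nanoseconds."""
    return os.stat(path).st_mtime_ns


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        wanted = set_mtime_ns(path, 1410993514, 248965866)
        # The two values agree only if the filesystem keeps nanoseconds.
        print(wanted, read_mtime_ns(path))
    finally:
        os.remove(path)
```

Reading st_mtime_ns rather than st_mtime matters here, because st_mtime is a float and would lose the last digits again.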
    

    One last thing:

    "Presumably the precision is lost because the value was converted to a float and Python knew better than to trust the last few decimal places."

    That actually might make sense… but Python doesn't do that. You can see the exact code it uses here, but basically, the only compensation it makes for rounding is ensuring that negative microseconds become 0. [3]

    But you're right that rounding errors are a potential problem here… which is why both *nix and Python avoid the problem by using separate seconds and nanoseconds integers (and Windows solves it by using a 64-bit int instead of a double).
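To put a number on the rounding concern, the sketch below estimates the gap between adjacent doubles near the timestamp from the question and contrasts it with integer arithmetic, which is exact. The function names are hypothetical; the only assumption is that Python floats are IEEE 754 doubles with 53 significant bits, which holds on all common platforms.

```python
import math


def double_spacing(x):
    """Gap between adjacent doubles around a positive, normal float x."""
    # frexp gives x == mantissa * 2**exponent with 0.5 <= mantissa < 1.
    _, exponent = math.frexp(x)
    # A double carries 53 significant bits.
    return 2.0 ** (exponent - 53)


def join_ns(seconds, nanoseconds):
    """Combine integer seconds and nanoseconds into one exact integer."""
    return seconds * 10**9 + nanoseconds


def split_ns(total_ns):
    """Inverse of join_ns: (seconds, nanoseconds)."""
    return divmod(total_ns, 10**9)


timestamp = 1410993514.248965866
print(double_spacing(timestamp))  # 2.384185791015625e-07, roughly 238 ns
print(split_ns(join_ns(1410993514, 248965866)))  # (1410993514, 248965866)
```

So a double holding a present-day Unix timestamp resolves microseconds but not individual nanoseconds, while the integer form round-trips exactly.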


    [1] If you're on Unix, that means you have a function such as utimensat or futimens, which work like utimes but take struct timespec instead of struct timeval. You should have them on any non-ancient Linux/glibc system; on *BSD it depends on the kernel, but I think everything but OS X has them nowadays; otherwise you probably don't. The easiest way to check is just man utimensat.

    [2] On Windows, Python uses native Win32 APIs that deal in 100 ns units, so you get only one extra digit this way, not three.

    [3] I linked to 3.2, because 3.3 is a bit harder to follow, partly because of the ns support that you care about, but mostly because of the at support that you don't.