
How to Block-Read a File with Python


I'm fairly new to Python and want to read from a file that another program writes to while it is running. My script should read each line as soon as the other program has written it.

Here is what I have:

#!/usr/bin/env python

import datetime
import os
import select
import sys

FILENAME = "/home/sjngm/coding/source/python/text.log"
with open(FILENAME, "r", encoding = "utf-8", errors = "ignore") as log:
    print("blocks: " + str(os.get_blocking(log.fileno())) + " / fd: " + str(log.fileno()) + " / " + str(log))
    while True:
        os.pread(log.fileno(), 1, 0)
        sel = select.select([log], [], [], 60000.0) #[0]
        line = log.readline().replace("\n", "")
        if line:
            print(line)
        else:
            print("-" + str(datetime.datetime.now()), end = "\r")

        # do something interesting with line...

text.log (for now it's just a regular text file and no other process accesses it):

line 1
line 2
line 3

It doesn't matter whether the last line ends with a \n or not.

Output:

[sjngm@runlikehell ~]$ python ~/coding/source/python/test.py 
blocks: True / fd: 3 / <_io.TextIOWrapper name='/home/sjngm/coding/source/python/text.log' mode='r' encoding='utf-8'>
line 1
line 2
line 3
^CTraceback (most recent call last):
  File "/home/sjngm/coding/source/python/test.py", line 16, in <module>
    line = log.readline().replace("\n", "")
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
KeyboardInterrupt
[sjngm@runlikehell ~]$ uname -a
Linux runlikehell 4.14.53-1-MANJARO #1 SMP PREEMPT Tue Jul 3 17:59:17 UTC 2018 x86_64 GNU/Linux
[sjngm@runlikehell ~]$ 

So it reports that blocking is enabled. After printing the three lines, the script keeps looping and prints the current time constantly, without any pause.

It should actually pause at pread(), select() or readline(), or at any other call that I just don't know about.

How do I make this work?

Note that I don't want to pipe the file to the script as I want to use curses later on and its getch() wouldn't work in such a scenario.


Solution

  • Seems like this isn't a common situation. What I'm doing now is this:

    import subprocess
    
    # tail -f keeps its end of the pipe open, so readline() blocks until a new line arrives
    with subprocess.Popen([ "tail", "-10000f", FILENAME ], encoding = "utf-8", errors = "ignore", universal_newlines = True, bufsize = 1, stdout = subprocess.PIPE).stdout as log:
        line = log.readline()
    

    In other words, I'm opening the pipe inside the script rather than piping something into the script. The buffering seems to be done by tail in combination with Popen's bufsize parameter. encoding and universal_newlines make readline() return a str rather than bytes (see Python's documentation for more info on that).

    The system's stdin remains available, and curses works nicely with keyboard/mouse events.
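
For completeness, a pure-Python alternative to spawning tail is to poll the file. Reading a regular file at end-of-file returns an empty result immediately instead of blocking, and select() reports regular files as always ready, which is why the loop in the question spins. The sketch below is not the approach used above. The names follow and poll_interval are made up for illustration, and the half-second default is an arbitrary assumption.

```python
import time


def follow(path, poll_interval=0.5, encoding="utf-8"):
    """Yield complete lines from path, waiting for more data at end of file.

    Hypothetical helper for illustration; it never returns on its own.
    """
    with open(path, "r", encoding=encoding, errors="ignore") as log:
        pending = ""  # holds a partially written line until its "\n" arrives
        while True:
            chunk = log.readline()
            if not chunk:
                # End of file for now: a regular file does not block here,
                # so sleep briefly instead of spinning at full speed.
                time.sleep(poll_interval)
                continue
            pending += chunk
            if pending.endswith("\n"):
                yield pending.rstrip("\n")
                pending = ""
```

Used as `for line in follow(FILENAME): ...`, the loop stays inside the script's own process, so stdin is left alone for curses. The trade-off is up to poll_interval of latency per line, and unlike tail -f this sketch does not notice if the file is truncated or replaced.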