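Consider the following MCVE, which runs a Python program that:
- reads FIXME, set during startup from the first command line argument
- starts an HTTP server whose PUT handler calls select on the self.rfile.fileno()
- sends a PUT request to that server with the requests module.

#!/usr/bin/env python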
import os
import select
import sys
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Iterator

import requests

ip = "127.0.0.1"
port = 10000
FIXME = True


class MyServer(BaseHTTPRequestHandler):
    def do_PUT(self):
        inputfd = self.rfile.fileno()
        os.set_blocking(inputfd, False)
        while 1:
            print("SERVER SELECT")
            sel = select.select([inputfd], [], [inputfd])
            if inputfd in sel[2]:
                print(f"SERVER {inputfd} errored")
                break
            if inputfd not in sel[0]:
                print(f"SERVER read file descriptor closed")
                break
            chunk: bytes = os.read(inputfd, 8192)
            print(f"SERVER RECV {len(chunk)} {chunk!r}")
            if len(chunk) == 0:
                print(f"SERVER len(chunk) == 0")
                break
            if chunk:
                print(f"{chunk!r}")
                if b"0\r\n\r\n" in chunk:
                    # terminating chunk
                    break
        self.send_response(200)
        self.end_headers()


def server():
    with ThreadingHTTPServer((ip, int(port)), MyServer) as webServer:
        webServer.serve_forever()


class Client:
    def get_chunks_to_write(self) -> Iterator[bytes]:
        if FIXME:
            time.sleep(0.1)
        print(f"CLIENT write")
        yield b"123"

    def main(self):
        requests.put(f"http://{ip}:{port}", data=self.get_chunks_to_write())


def cli():
    threading.Thread(target=server, daemon=True).start()
    # Wait for server startup
    time.sleep(0.5)
    Client().main()


if __name__ == "__main__":
    FIXME = int(sys.argv[1])
    cli()

When FIXME is set, the execution is fine:
$ ./test.py 1
SERVER SELECT
CLIENT write
SERVER RECV 13 b'3\r\n123\r\n0\r\n\r\n'
b'3\r\n123\r\n0\r\n\r\n'
127.0.0.1 - - [27/Nov/2023 11:14:01] "PUT / HTTP/1.1" 200 -
However, when FIXME is false, the execution blocks and the bytes are lost (it takes about 10 tries to reproduce with this MCVE, but it happens every time in the real program):
$ ./test.py 0
CLIENT write
SERVER SELECT
What can I do to remove the extra sleep? What synchronization is missing? What is happening?
I think this happens because the client closes the connection or transfers the bytes before the server can enter select. But I do not know how that could influence anything, as select should still allow processing the remaining bytes on the socket.
I tried searching the net. Calling set_blocking(inputfd, True) and os.read(inputfd, 1) also does not read the transferred bytes. I assume this is because they were already read by BaseHTTPRequestHandler. How can I access them?
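The suspicion that the bytes were already consumed can be checked in isolation, without an HTTP server. The sketch below assumes that self.rfile behaves like the buffered reader returned by socket.makefile("rb"); the socket pair and the variable names are hypothetical stand-ins for the real connection. All bytes are sent before the first readline(), which mimics the client writing before the handler reaches select.

```python
import select
import socket

# A connected pair of sockets stands in for the client/server connection.
client, server = socket.socketpair()

# The "client" sends the request line, an empty header block and the chunked body at once.
client.sendall(b"PUT / HTTP/1.1\r\n\r\n3\r\n123\r\n0\r\n\r\n")

# Buffered reader on the server side, comparable to self.rfile in the handler.
rfile = server.makefile("rb")

# Reading only the request line pulls everything available into the buffer.
first_line = rfile.readline()

# The raw socket now has nothing left, so select does not report it as readable.
readable, _, _ = select.select([server], [], [], 0.1)
socket_ready = bool(readable)

# The body is still there, but only inside the reader's buffer.
buffered = rfile.peek()

print(first_line, socket_ready, buffered)

rfile.close()
client.close()
server.close()
```

If the body arrives only after the headers were parsed, as with the 0.1 s sleep, it is still waiting on the socket and select sees it, which would explain why the sleep hides the problem.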
You are right: Python reads from the socket first and buffers the data. select watches the underlying socket, but Python keeps the data it has already read in a buffer inside self.rfile, so bytes that arrived together with the headers never make the socket readable again. Do not use select.select and os.read() on the raw socket. Instead, use the Python wrappers with blocking reads, such as self.rfile.read() and self.rfile.readline(), which use the buffers managed by Python.
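As a minimal sketch of that advice, the handler below reads the body only through self.rfile. It assumes the client sends the body with chunked transfer encoding, as the output in the question suggests; read_chunked_body and BufferedPutHandler are hypothetical names introduced for illustration, and error handling for malformed input is left out.

```python
import io
from http.server import BaseHTTPRequestHandler


def read_chunked_body(rfile) -> bytes:
    """Read a chunked request body from a buffered binary stream."""
    body = bytearray()
    while True:
        size_line = rfile.readline()
        if not size_line:
            break  # peer closed the connection early
        # The chunk size is hexadecimal and may be followed by ";extensions".
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            # Terminating chunk: skip optional trailers up to the blank line.
            while rfile.readline() not in (b"\r\n", b"\n", b""):
                pass
            break
        body += rfile.read(size)
        rfile.readline()  # consume the CRLF that follows the chunk data
    return bytes(body)


class BufferedPutHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        # Blocking reads through self.rfile see bytes that are already buffered.
        body = read_chunked_body(self.rfile)
        print(f"SERVER RECV {len(body)} {body!r}")
        self.send_response(200)
        self.end_headers()


# The parser can be exercised without a server, using the bytes from the question.
demo_body = read_chunked_body(io.BytesIO(b"3\r\n123\r\n0\r\n\r\n"))
print(demo_body)
```

Passing BufferedPutHandler to ThreadingHTTPServer in place of MyServer should then work with or without the sleep, because it no longer matters whether the body reached the buffer before the handler started reading.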