It could be my code is wrongly implemented, but I'm finding that while I can serve up GET requests from literal data, I cannot update that data and have it shown as updated in subsequent GET requests. I also cannot have POST requests update the data.
So it behaves as though somewhere in Python's HTTPServer or BaseHTTPRequestHandler there's caching or forking happening.
Thanks in advance for looking it over, but, gently, no, I do not want to use a non-core 3.8 module or rewrite with a wholly different framework such as Flask. I think this should work, but it's misbehaving and I can't spot why. If I were using C's or Go's built-in libraries, I'd expect it would not be as much of a head-scratcher (for me).
To demonstrate, you'd run the following Python implementation, and load http://127.0.0.1:8081/ two or three times:
"""
A Quick test server on 8081.
"""
from http.server import HTTPServer, BaseHTTPRequestHandler
import cgi
import json
import os
import sys

ADDR = '127.0.0.1'
PORT = 8081


def run(server_class=HTTPServer, handler_class=BaseHTTPRequestHandler):
    server_address = (ADDR, PORT)
    with server_class(server_address, handler_class) as httpd:
        print("serving at", ADDR, "on", PORT, f"[ http://{ADDR}:{PORT} ]")
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            print(" stopping web server due to interrupt signal...")
            httpd.socket.close()


class SimpleHandler(BaseHTTPRequestHandler):
    """
    Implements responses to GET POST
    """

    def __init__(self, request, client_address, server):
        """Sets up the server's memory, a favicon, and one text pseudo-file."""
        self.files = {
            '/oh': ['text/plain', "It's me", ],
            '/favicon.ico': [
                'image/svg+xml',
                '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 48 48"><text y="1em" font-size="48">⁇</text></svg>',
            ],
        }
        self.head = '<link rel="icon" type="image/svg+xml" sizes="48x48" '\
            'href="/favicon.ico">'
        super(SimpleHandler, self).__init__(request, client_address, server)

    def _set_headers(self, content_type='application/json', response=200):
        self.send_response(response)
        self.send_header("Content-type", content_type)
        self.end_headers()

    def _html(self, message, title='Simple Server', extra=""):
        """This generates HTML with `message` in the h1 of body."""
        content = f"<html><head><title>{title}</title>{self.head}</head>" \
                  f"<body><h1>{message}</h1>{extra}</body></html>"
        return content.encode("utf8")  # NOTE: must return a bytes object!

    def do_GET(self):
        """Respond to a GET request."""
        if self.path == "/":
            self._set_headers('text/html')
            fnames = [f'<li><a href="{fn}">{fn}</a></li>' for fn in self.files.keys()]
            fnames.sort()
            self.wfile.write(self._html(
                "Welcome",
                extra='Try:'
                      '<ul>'
                      '<li><a href="/hello">/hello</a></li>'
                      f'{"".join(fnames)}'
                      '</ul>'
            ))
        elif self.path == "/hello":
            self._set_headers('text/html')
            self.wfile.write(self._html("hello you"))
        elif self.path in self.files:
            content_type, content = self.files[self.path]
            self.send_response(200)
            self._set_headers(content_type)
            self.wfile.write(content.encode())
        else:
            self.send_error(404)
        # Note this update doesn't seem to happen to the in memory dict.
        self.files[f"/{len(self.files)}"] = [
            "text/html", self._html(len(self.files))]

    def do_HEAD(self):
        if self.path in ["/", "/hello"]:
            self._set_headers('text/html')
        elif self.path in self.files:
            content_type, _ = self.files[self.path]
            self._set_headers(content_type)
        else:
            self.send_error(404)

    def do_POST(self):
        """Should update pseudo-files with posted file contents."""
        ctype, pdict = cgi.parse_header(
            self.headers.get('content-type', self.headers.get_content_type()))
        print("POSTED with content type", ctype)
        content = None
        if ctype == 'application/x-www-form-urlencoded':
            print(" * This multipart/form-data method might not work")
            content = {"content": str(self.rfile.read(int(self.headers['Content-Length'])).decode())}
        elif ctype == 'multipart/form-data':
            print(" * This multipart/form-data method might not work")
            fields = cgi.parse_multipart(self.rfile, pdict)
            content = {"content": fields.get('content')}
        elif ctype == 'application/json':
            data_string = self.rfile.read(int(self.headers['Content-Length']))
            content = json.loads(data_string)
        else:
            self.send_error(404)
        print(" * Received content:", content)
        # Note this update doesn't seem to happen to the in memory dict.
        self.files[self.path] = ['application/json', content]
        self._set_headers(response=201)
        self.wfile.write(json.dumps(content).encode())


if __name__ == '__main__':
    print('FYI:')
    print('    LANG =', os.getenv('LANG'))
    print('    Default Charset Encoding =', sys.getdefaultencoding())
    path_to_script = os.path.dirname(os.path.realpath(__file__))
    print('Serving from path:', path_to_script)
    os.chdir(path_to_script)
    run(handler_class=SimpleHandler)
Even before loading http://127.0.0.1:8081/ one could try posting to add something to the self.files dict, e.g.:
curl -v -H 'content-type: application/json' \
--data-binary '{"this": "should work"}' http://127.0.0.1:8081/new_file
And you can see the server respond, and also print the data received, which should now be in self.files, and therefore the / listing should show it.
You can mix it up with:
curl -v --data-urlencode 'content={"this": "should work"}' http://127.0.0.1:8081/new_file2
But neither of these adds a self.files['/new_file'] or '/new_file2' entry, and it's just not clear why.
One should be able to request /new_file or /new_file2, and those are 404 instead.
With the last lines in do_GET, multiple GET / requests should show more listed items.
$ curl http://127.0.0.1:8081
<html><head><title>Simple Server</title><link rel="icon" type="image/svg+xml" sizes="48x48" href="/favicon.ico"></head><body><h1>Welcome</h1>Try:<ul><li><a href="/hello">/hello</a></li><li><a href="/favicon.ico">/favicon.ico</a></li><li><a href="/oh">/oh</a></li></ul></body></html>
$ curl http://127.0.0.1:8081
<html><head><title>Simple Server</title><link rel="icon" type="image/svg+xml" sizes="48x48" href="/favicon.ico"></head><body><h1>Welcome</h1>Try:<ul><li><a href="/hello">/hello</a></li><li><a href="/favicon.ico">/favicon.ico</a></li><li><a href="/oh">/oh</a></li></ul></body></html>
Moving those lines that add a new key and value into self.files to the top of do_GET shows that it does update, but only one time, which just seems odder still:
$ curl http://127.0.0.1:8081
<html><head><title>Simple Server</title><link rel="icon" type="image/svg+xml" sizes="48x48" href="/favicon.ico"></head><body><h1>Welcome</h1>Try:<ul><li><a href="/hello">/hello</a></li><li><a href="/2">/2</a></li><li><a href="/favicon.ico">/favicon.ico</a></li><li><a href="/oh">/oh</a></li></ul></body></html>
$ curl http://127.0.0.1:8081
<html><head><title>Simple Server</title><link rel="icon" type="image/svg+xml" sizes="48x48" href="/favicon.ico"></head><body><h1>Welcome</h1>Try:<ul><li><a href="/hello">/hello</a></li><li><a href="/2">/2</a></li><li><a href="/favicon.ico">/favicon.ico</a></li><li><a href="/oh">/oh</a></li></ul></body></html>
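One way to probe this "updates only once" behaviour is to count how often the handler's __init__ actually runs. The following is a minimal sketch, not the server above: CountingHandler, per_instance and demo are hypothetical names made up for the illustration, and it binds to an ephemeral port rather than 8081. It assumes the default HTTP/1.0 behaviour of BaseHTTPRequestHandler, where each request arrives on its own connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that records how often __init__ runs."""

    instances = 0  # class attribute, shared by every handler object

    def __init__(self, request, client_address, server):
        CountingHandler.instances += 1
        # Instance attribute: rebuilt from scratch for every new handler.
        self.per_instance = {"created_as": CountingHandler.instances}
        # The base __init__ processes the request, so set attributes first.
        super().__init__(request, client_address, server)

    def do_GET(self):
        body = f"handler #{self.per_instance['created_as']}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def demo(requests=3):
    """Serve on an ephemeral port and return the bodies of a few GETs."""
    CountingHandler.instances = 0
    httpd = HTTPServer(("127.0.0.1", 0), CountingHandler)
    port = httpd.server_address[1]
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    try:
        bodies = []
        for _ in range(requests):
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/")
            bodies.append(conn.getresponse().read().decode())
            conn.close()
        return bodies
    finally:
        httpd.shutdown()
        httpd.server_close()


if __name__ == "__main__":
    print(demo())
```

If every response names a different handler number, then anything assigned to self in __init__ starts fresh on each request, which would match seeing /2 added once and never /3.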
Okay, it turns out that a new SimpleHandler is made for each request, so I had to move self.files out to the outer scope and also be careful about what is set up during SimpleHandler's __init__. That basically makes the behavior what I had expected.
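For anyone landing here later, a minimal sketch of that change might look like the following. It is a reduced illustration rather than the full server above: FILES, SharedStateHandler, _reply, fetch and demo are names invented for this example, the listing is returned as JSON instead of HTML, and it uses an ephemeral port so it can run alongside anything already on 8081.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Module-level store: created once, so it outlives individual handler objects.
FILES = {"/oh": ["text/plain", b"It's me"]}


class SharedStateHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that keeps no per-instance copy of the data."""

    def _reply(self, status, content_type, body):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/":
            self._reply(200, "application/json", json.dumps(sorted(FILES)).encode())
        elif self.path in FILES:
            content_type, body = FILES[self.path]
            self._reply(200, content_type, body)
        else:
            self.send_error(404)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)  # stored as raw bytes
        ctype = self.headers.get("Content-Type", "application/octet-stream")
        FILES[self.path] = [ctype, body]
        self._reply(201, "application/json", json.dumps({"stored": self.path}).encode())

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(port, method, path, body=None, headers=None):
    """Issue one request on its own connection; return (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path, body=body, headers=headers or {})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def demo():
    """GET, POST, then GET again, and return what each step produced."""
    httpd = HTTPServer(("127.0.0.1", 0), SharedStateHandler)
    port = httpd.server_address[1]
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    try:
        before = fetch(port, "GET", "/new_file")[0]
        created = fetch(port, "POST", "/new_file", body=b'{"this": "should work"}',
                        headers={"Content-Type": "application/json"})[0]
        after = fetch(port, "GET", "/new_file")
        listing = json.loads(fetch(port, "GET", "/")[1])
        return before, created, after, listing
    finally:
        httpd.shutdown()
        httpd.server_close()
```

Calling demo() performs the same sequence as the curl commands in the question. A class attribute on the handler would presumably work just as well as a module-level dict; the point is only that the data is not rebuilt in __init__.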