How does python's SimpleHTTPServer do_GET and do_POST functions work?


I've created the following little HTTP server for learning purposes:

import cgi
import SimpleHTTPServer
import SocketServer

class ServerHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):

    def do_GET(self):
        print(self.headers)
        SimpleHTTPServer.SimpleHTTPRequestHandler.do_GET(self)

    def do_POST(self):
        print(self.headers)
        # Parse the submitted form data from the request body
        form = cgi.FieldStorage(
            fp=self.rfile,
            headers=self.headers,
            environ={'REQUEST_METHOD':'POST',
                 'CONTENT_TYPE':self.headers['Content-Type'],
                     })

def main():
    port = 50738
    Handler = ServerHandler  # pass the class itself; the server instantiates it per request
    httpd = SocketServer.TCPServer(("192.168.X.Y", port), Handler)

    print "serving at port", port
    httpd.serve_forever()


if __name__ == "__main__":
    main()

My assumptions are as follows:

  • My class 'ServerHandler' extends the SimpleHTTPServer.SimpleHTTPRequestHandler class with two methods, namely do_GET and do_POST
  • The main() function creates a server handler object and server socket bound to my I.P. address and port of choice, and invokes a function to serve/listen indefinitely.

Aside: I know by looking at the Python DOCs https://docs.python.org/2/library/simplehttpserver.html that SimpleHTTPServer.SimpleHTTPRequestHandler has a method called do_GET, which I assume gets overridden by the do_GET in my ServerHandler class?

Question: What is going on under the hood relating to do_GET and do_POST? Is it the case that once we have this server listening for HTTP "activity" directed towards a specific IP:PORT, it automatically knows whether an incoming request is a GET or a POST, and as soon as one is encountered the server calls my do_GET or do_POST function?


Solution

  • When you call SocketServer.TCPServer, you assign your Handler class as the class to receive incoming requests.

    All that the SimpleHTTPServer module has helped you with is providing the basic HTTP functionality, but you could write all of that yourself.

    So, as you say, when you define Handler, you are inheriting all the methods from the SimpleHTTPRequestHandler class, but then overriding two of the pre-defined methods: do_GET and do_POST. You could also override any other methods in the class.

    However, these do_* methods would never be called if it weren't for the handle() method that SimpleHTTPRequestHandler inherits from BaseHTTPRequestHandler, because handle() is the method the socketserver module invokes for each incoming connection.

    So if you were to inherit only from socketserver.BaseRequestHandler, you would lose all of this functionality, as that class's handle() method does nothing:

    class socketserver.BaseRequestHandler

    ...

    handle()

    This function must do all the work required to service a request. The default implementation does nothing. Several instance attributes are available to it; the request is available as self.request; the client address as self.client_address; and the server instance as self.server, in case it needs access to per-server information.

    ...
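To make the role of handle() concrete, here is a minimal sketch of a handler that inherits only from socketserver.BaseRequestHandler and supplies its own handle(). It uses the Python 3 module name (socketserver rather than SocketServer); the class UpperCaseHandler and the helper ask_server are hypothetical names made up for this illustration.

```python
import socket
import socketserver
import threading


class UpperCaseHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: replies with the received bytes upper-cased."""

    def handle(self):
        # self.request is the connected socket for this client.
        data = self.request.recv(1024)
        self.request.sendall(data.upper())


def ask_server(message):
    """Start a throwaway server, send one message, return the reply."""
    # Port 0 asks the OS for any free port. Note that the handler
    # *class* is passed; the server creates an instance per connection.
    with socketserver.TCPServer(("127.0.0.1", 0), UpperCaseHandler) as server:
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            with socket.create_connection(server.server_address) as client:
                client.sendall(message)
                return client.recv(1024)
        finally:
            server.shutdown()
            thread.join()


if __name__ == "__main__":
    print(ask_server(b"hello"))
```

Nothing here knows about HTTP: without a handle() that parses requests, the server only moves raw bytes.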

    So, by importing the SimpleHTTPRequestHandler from the SimpleHTTPServer module, you immediately get the basic functionality for an HTTP server.

    All this functionality is documented here, with an important bit on its handle method:

    class http.server.BaseHTTPRequestHandler(request, client_address, server)

    ...

    handle()

    Calls handle_one_request() once (or, if persistent connections are enabled, multiple times) to handle incoming HTTP requests. You should never need to override it; instead, implement appropriate do_*() methods.

    handle_one_request()

    This method will parse and dispatch the request to the appropriate do_*() method. You should never need to override it.

    ...
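The "multiple times" case in the handle() description can be observed directly. The following is a sketch under some assumptions: it uses the Python 3 http.server module, sets protocol_version to "HTTP/1.1" so that persistent connections are enabled, and relies on a hypothetical per-instance counter (served) to show that one handler instance serves both requests on the same connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 enables persistent connections in http.server.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # One handler instance exists per connection, so this counter
        # only grows when handle() loops on the same connection.
        self.served = getattr(self, "served", 0) + 1
        body = str(self.served).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def two_requests_one_connection():
    """Send two GETs over a single connection; return both bodies."""
    with HTTPServer(("127.0.0.1", 0), CountingHandler) as server:
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        host, port = server.server_address
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            replies = []
            for _ in range(2):
                conn.request("GET", "/")
                replies.append(conn.getresponse().read().decode("ascii"))
            return replies
        finally:
            # Close the client first so the handler's loop can end.
            conn.close()
            server.shutdown()
            thread.join()


if __name__ == "__main__":
    print(two_requests_one_connection())
```

With the default protocol_version of "HTTP/1.0" the server would close the connection after each response, so each request would get a fresh handler instance.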

    So finally, having broken down how socketserver.TCPServer calls the handle() method of whatever handler class you pass it, we can see that SimpleHTTPRequestHandler implements this by passing the request on to do_GET, do_POST or whichever do_* method matches the HTTP method named in the request line (not the headers).
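As a rough end-to-end check of that dispatch, the sketch below defines only do_GET and do_POST and then sends GET, POST and PUT requests. It assumes Python 3 (http.server and http.client); DemoHandler and send are hypothetical names used only for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that only reports which do_* method ran."""

    def _reply(self, text):
        body = text.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        self._reply("do_GET called")

    def do_POST(self):
        # Consume the request body before answering.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self._reply("do_POST called")

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def send(method, body=None):
    """Serve one request on a throwaway server; return (status, text)."""
    with HTTPServer(("127.0.0.1", 0), DemoHandler) as server:
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        host, port = server.server_address
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request(method, "/", body=body)
            response = conn.getresponse()
            return response.status, response.read().decode("utf-8")
        finally:
            conn.close()
            server.shutdown()
            thread.join()


if __name__ == "__main__":
    print(send("GET"))
    print(send("POST", b"a=1"))
    print(send("PUT")[0])  # no do_PUT is defined on DemoHandler
```

Because DemoHandler has no do_PUT, the PUT request should come back with status 501 (Not Implemented) rather than reaching any of the handler's own methods.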


    If you want to see how you could implement this yourself, take a look at the source code, either in /usr/lib/pythonX.Y/http/server.py or on GitHub.

    We can see there that SimpleHTTPRequestHandler inherits from BaseHTTPRequestHandler, which is where the handle() and handle_one_request() methods are defined.

    So, as the docs describe, handle just passes requests to handle_one_request until the connection closes:

    def handle(self):
        """Handle multiple requests if necessary."""
        self.close_connection = True
    
        self.handle_one_request()
        while not self.close_connection:
            self.handle_one_request()
    

    and the handle_one_request is where the do_* methods get called:

    def handle_one_request(self):
        """Handle a single HTTP request.
        You normally don't need to override this method; see the class
        __doc__ string for information on how to handle specific HTTP
        commands such as GET and POST.
        """
        try:
            self.raw_requestline = self.rfile.readline(65537)
            if len(self.raw_requestline) > 65536:
                self.requestline = ''
                self.request_version = ''
                self.command = ''
                self.send_error(HTTPStatus.REQUEST_URI_TOO_LONG)
                return
            if not self.raw_requestline:
                self.close_connection = True
                return
            if not self.parse_request():
                # An error code has been sent, just exit
                return
            mname = 'do_' + self.command   ## the name of the method is created
            if not hasattr(self, mname):   ## checking that we have that method defined
                self.send_error(
                    HTTPStatus.NOT_IMPLEMENTED,
                    "Unsupported method (%r)" % self.command)
                return
            method = getattr(self, mname)  ## getting that method
            method()                       ## finally calling it
            self.wfile.flush() #actually send the response if not already done.
        except socket.timeout as e:
            #a read or a write timed out.  Discard this connection
            self.log_error("Request timed out: %r", e)
            self.close_connection = True
            return
    

    (Note: I double-hashed (##) my comments to separate them from the original author's.)
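The name-based lookup at the heart of handle_one_request can also be reproduced without any networking. This is only a simplified sketch of the pattern: MiniDispatcher and its dispatch method are hypothetical, and the string return values stand in for real HTTP responses.

```python
class MiniDispatcher:
    """Hypothetical stand-in for the request handler's lookup logic."""

    def do_GET(self):
        return "handled GET"

    def do_POST(self):
        return "handled POST"

    def dispatch(self, command):
        # Build the method name from the HTTP verb, e.g. "do_GET".
        mname = "do_" + command
        if not hasattr(self, mname):
            # The real handler sends a 501 error response here.
            return "501 Unsupported method (%r)" % command
        # Look the method up by name and call it.
        return getattr(self, mname)()


if __name__ == "__main__":
    d = MiniDispatcher()
    print(d.dispatch("GET"))
    print(d.dispatch("DELETE"))
```

Supporting another verb is then just a matter of adding a method with the matching do_ prefix, which is why overriding do_GET and do_POST in a subclass is enough.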