Tags: python-3.x, ctypes, cpython, monkeypatching, pyopenssl

Retrieving address of native base class with ctypes


I want to be able to pass a certificate to Python's ssl library without requiring a temporary file. It seems that the Python ssl module cannot do that.

To work around this problem I want to retrieve the underlying SSL_CTX struct stored in the ssl._ssl._SSLContext class from the native _ssl module. Using ctypes I could then manually call the respective SSL_CTX_* functions from libssl with that context. How to do that in C is shown here and I would do the same thing via ctypes.

Unfortunately, I'm stuck at the point where I managed to hook into the load_verify_locations function from ssl._ssl._SSLContext but seem to be unable to get the right memory address of the instance of the ssl._ssl._SSLContext struct. All the load_verify_locations function is seeing is the parent ssl.SSLContext object.

My question is: how do I get from an instance of ssl.SSLContext to the memory of the native base class ssl._ssl._SSLContext? If I had that, I could easily access its ctx member.

Here is my code so far. Credit for the technique of monkeypatching a native Python module goes to the forbidden fruit project by Lincoln Clarete:

import ctypes
import ssl
import urllib.request

# Py_ssize_t has the same width as size_t; ctypes exposes it directly.
Py_ssize_t = ctypes.c_ssize_t

class PyObject(ctypes.Structure):
    pass

PyObject._fields_ = [
    ('ob_refcnt', Py_ssize_t),
    ('ob_type', ctypes.POINTER(PyObject)),
]

class SlotsProxy(PyObject):
    _fields_ = [('dict', ctypes.POINTER(PyObject))]

class PySSLContext(ctypes.Structure):
    pass

PySSLContext._fields_ = [
        ('ob_refcnt', Py_ssize_t),
        ('ob_type', ctypes.POINTER(PySSLContext)),
        ('ctx', ctypes.c_void_p),
        ]

name = ssl._ssl._SSLContext.__name__
target = ssl._ssl._SSLContext.__dict__
proxy_dict = SlotsProxy.from_address(id(target))
namespace = {}
ctypes.pythonapi.PyDict_SetItem(
        ctypes.py_object(namespace),
        ctypes.py_object(name),
        proxy_dict.dict,
)
patchable = namespace[name]

old_value = patchable["load_verify_locations"]

libssl = ctypes.cdll.LoadLibrary("libssl.so.1.0.0")
libssl.SSL_CTX_set_verify.argtypes = (ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p)
libssl.SSL_CTX_get_verify_mode.argtypes = (ctypes.c_void_p,)

def load_verify_locations(self, cafile, capath, cadata):
    print(self)
    print(self.verify_mode)
    addr = PySSLContext.from_address(id(self)).ctx
    libssl.SSL_CTX_set_verify(addr, 1337, None)
    print(libssl.SSL_CTX_get_verify_mode(addr))
    print(self.verify_mode)
    return old_value(self, cafile, capath, cadata)

patchable["load_verify_locations"] = load_verify_locations

context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)

The output is:

<ssl.SSLContext object at 0x7f4b81304ba8>
2
1337
2

This suggests that whatever I'm changing is not the SSL context that Python knows about but some other, unrelated memory location.

To try out the code above, you have to run an HTTPS server. Generate a self-signed SSL certificate using:

$ openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -subj '/CN=localhost' -nodes

And start a server using the following code:

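import http.server
import ssl
httpd = http.server.HTTPServer(('localhost', 4443), http.server.SimpleHTTPRequestHandler)
# ssl.wrap_socket() was removed in Python 3.12, so wrap the socket through an SSLContext instead.
server_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
server_context.load_cert_chain(certfile='cert.pem', keyfile='key.pem')
httpd.socket = server_context.wrap_socket(httpd.socket, server_side=True)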
httpd.serve_forever()

And then add the following line to the end of my example code above:

urllib.request.urlopen("https://localhost:4443", context=context)

Solution

  • The premise of the question is no longer correct; the answer to the actual SSLContext question follows further down.

    See https://docs.python.org/3/library/ssl.html#ssl.SSLContext.load_verify_locations

    load_verify_locations takes a third argument, cadata:

    The cadata object, if present, is either an ASCII string of one or more PEM-encoded certificates or a bytes-like object of DER-encoded certificates.

    This argument has been available since Python 3.4.
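
As a minimal sketch of the cadata route, the snippet below moves CA certificates from one context into another purely in memory, with no temporary file. The helper name copy_trust_in_memory is made up for illustration, and the platform's default trust store is used only as a convenient source of real certificates; in practice the PEM text would come from wherever the application keeps it.

```python
import ssl


def copy_trust_in_memory():
    """Copy CA certificates between two contexts without touching disk.

    Hypothetical helper: returns (certificates in source, certificates loaded).
    """
    # Source of real certificates: whatever the default store exposes.
    source = ssl.create_default_context()
    der_certs = source.get_ca_certs(binary_form=True)

    # A fresh context starts with an empty trust store.
    target = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if der_certs:
        # cadata accepts a str holding one or more PEM-encoded certificates.
        pem_text = "".join(ssl.DER_cert_to_PEM_cert(der) for der in der_certs)
        target.load_verify_locations(cadata=pem_text)
    return len(der_certs), len(target.get_ca_certs(binary_form=True))


n_source, n_loaded = copy_trust_in_memory()
print(n_source, n_loaded)
```

On a system whose default store only uses a certificate directory, get_ca_certs() may return an empty list, in which case nothing is loaded and both counts are zero.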

    Getting the underlying PyObject context

    This one's easy: ssl.SSLContext inherits from _ssl._SSLContext, which in the Python data model means that there is just one object at one memory address.

    Therefore, ssl.SSLContext().load_verify_locations(...) will actually call:

    ctx = \
    ssl.SSLContext.__new__(<type ssl.SSLContext>, ...)  # which calls
        self = _ssl._SSLContext.__new__(<type ssl.SSLContext>, ...)  # which calls
            <type ssl.SSLContext>->tp_alloc()  # which understands inheritance
            self->ctx = SSL_CTX_new(...)  # _ssl fields
        self.set_ciphers(...)  # ssl fields
        return self
    
    _ssl._SSLContext.load_verify_locations(ctx, ...)
    

    The C implementation will get an object of seemingly wrong type, but that's OK because all the expected fields are there, as it was allocated by generic type->tp_alloc and fields were filled in first by _ssl._SSLContext and then by ssl.SSLContext.

    Here's a demonstration (tedious details omitted):

    # _parent.c
    typedef struct {
      PyObject_HEAD
    } PyParent;
    
    static PyObject* parent_new(PyTypeObject* type, PyObject* args,
                                PyObject* kwargs) {
      PyParent* self = (PyParent*)type->tp_alloc(type, 0);
      printf("Created parent %ld\n", (long)self);
      return (PyObject*)self;
    }
    
    # child.py
    class Child(_parent.Parent):
        def foo(self):
            print(id(self))
    
    c1 = Child()
    print("Created child:", id(c1))
    
    # prints:
    Created parent 139990593076080
    Created child: 139990593076080
    

    Getting the underlying OpenSSL context

    typedef struct {
        PyObject_HEAD
        SSL_CTX *ctx;
        <details skipped>
    } PySSLContext;
    

    Thus, ctx sits at a known offset, directly after PyObject_HEAD, which the C API documentation describes as follows:

    PyObject_HEAD
    This is a macro which expands to the declarations of the fields of the PyObject type; it is used when declaring new types which represent objects without a varying length. The specific fields it expands to depend on the definition of Py_TRACE_REFS. By default, that macro is not defined, and PyObject_HEAD expands to:
    
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
    
    When Py_TRACE_REFS is defined, it expands to:
    
    PyObject *_ob_next, *_ob_prev;
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
    

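    In Python 3 the macro expands to a single PyObject ob_base member, which contains the same two fields. Thus, in a production (non-debug) build on a 64-bit platform, and taking natural alignment into consideration, PySSLContext effectively becomes: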

    struct {
        void*;
        void*;
        SSL_CTX *ctx;
        ...
    }
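
The offset can be worked out with ctypes instead of by hand. The structure below is a hypothetical mirror of only the first three fields, assuming a regular non-debug CPython build; it is not the real PySSLContext definition.

```python
import ctypes


class PySSLContextHead(ctypes.Structure):
    """Hypothetical mirror of the leading fields of CPython's PySSLContext."""
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),  # Py_ssize_t
        ("ob_type", ctypes.c_void_p),     # PyTypeObject *
        ("ctx", ctypes.c_void_p),         # SSL_CTX *
    ]


pointer_size = ctypes.sizeof(ctypes.c_void_p)
ctx_offset = PySSLContextHead.ctx.offset

# object.__basicsize__ is sizeof(PyObject) in the running interpreter; if it
# differs from ctx_offset, the header assumption does not hold for this build.
layout_matches = object.__basicsize__ == ctx_offset

print(ctx_offset, layout_matches)
```

If layout_matches is False (for example on a build with extra header fields), the fixed index 2 used in the rest of this answer should not be trusted.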
    

    Therefore:

    _ctx = _ssl._SSLContext(2)
    c_ctx = ctypes.cast(id(_ctx), ctypes.POINTER(ctypes.c_void_p))
    c_ctx[:3]
    [1, 140486908969728, 94916219331584]
    # refcnt,      type,          C ctx
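
Because a wrong offset means reading or writing arbitrary memory, it seems worth validating the header before trusting slot 2. The sketch below uses a hypothetical helper name and assumes CPython, where id() returns the object's address. It checks that slot 1 really is the type pointer before returning the SSL_CTX pointer.

```python
import ctypes
import ssl


def read_ssl_ctx_pointer(ctx):
    """Return the SSL_CTX* stored in ctx, assuming the layout described above."""
    words = ctypes.cast(id(ctx), ctypes.POINTER(ctypes.c_void_p))
    # Slot 1 should be ob_type; in CPython, id() of a type is its address.
    if words[1] != id(type(ctx)):
        raise RuntimeError("unexpected object header; refusing to guess offsets")
    # Slot 2 is the first field after the header: SSL_CTX *ctx.
    return words[2]


context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ssl_ctx_address = read_ssl_ctx_pointer(context)
print(hex(ssl_ctx_address))
```

The check only confirms the object header; the position of ctx after it still depends on the PySSLContext definition of the CPython version in use.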
    

    Putting it all together

    import ssl
    import socket
    import ctypes
    import pytest
    
    
    def contact_github(cafile=""):
        ctx = ssl.SSLContext()
        ctx.verify_mode = ssl.VerifyMode.CERT_REQUIRED
    
        # ctx.load_verify_locations(cafile, "empty", None) done via ctypes
        ssl_ctx = ctypes.cast(id(ctx), ctypes.POINTER(ctypes.c_void_p))[2]
        cssl = ctypes.CDLL("/usr/lib/x86_64-linux-gnu/libssl.so.1.1")
        cssl.SSL_CTX_load_verify_locations.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_char_p]
        assert cssl.SSL_CTX_load_verify_locations(ssl_ctx, cafile.encode("utf-8"), b"empty")
    
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect(("github.com", 443))
    
        ss = ctx.wrap_socket(s)
        ss.send(b"GET / HTTP/1.0\r\n\r\n")
        print(ss.recv(1024))
    
    
    def test_wrong_cert():
        with pytest.raises(ssl.SSLError):
            contact_github(cafile="bad-cert.pem")
    
    
    def test_correct_cert():
        contact_github(cafile="good-cert.pem")
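
The hard-coded library path above is specific to one distribution and OpenSSL version. A more portable lookup might be sketched as follows; load_libssl is a hypothetical helper, and the sketch assumes that the library found by ctypes.util.find_library is the same libssl that Python's _ssl module is linked against. If it is not, passing Python's SSL_CTX pointer to it is unsafe.

```python
import ctypes
import ctypes.util


def load_libssl():
    """Try to locate and load libssl; return None when that is not possible."""
    name = ctypes.util.find_library("ssl")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
        func = lib.SSL_CTX_load_verify_locations
    except (OSError, AttributeError):
        # Library could not be loaded, or it is not an OpenSSL-compatible libssl.
        return None
    # int SSL_CTX_load_verify_locations(SSL_CTX *ctx,
    #                                   const char *CAfile, const char *CApath);
    func.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_char_p]
    func.restype = ctypes.c_int
    return lib


libssl = load_libssl()
print("libssl available:", libssl is not None)
```

Declaring argtypes matters here: without it ctypes would pass the context pointer as a C int and truncate it on 64-bit platforms.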