Tags: python, python-3.x, ctypes, libc, getaddrinfo

Calling getaddrinfo directly from Python: ai_addr is null pointer


I'm trying to call getaddrinfo from Python, through ctypes / libc, on Mac OS, in order to find the IP address of a domain.

The call appears to succeed: no error code is returned, and ai_addrlen is set to 28, which I understand is the appropriate length for an IPv6 address. However, ai_addr appears to be a null pointer, and I'm not sure how to begin to debug it.

How can I find the IP address of a domain using libc.getaddrinfo?

from ctypes import (
    byref,
    c_char, c_char_p, c_int, c_size_t, c_void_p,
    CDLL,
    POINTER,
    pointer,
    Structure,
)

libc = CDLL(None)

class c_addrinfo(Structure):
    pass

c_addrinfo._fields_ = [
    ('ai_flags', c_int),
    ('ai_family', c_int),
    ('ai_socktype', c_int),
    ('ai_protocol', c_int),
    ('ai_addrlen', c_size_t),
    ('ai_addr', c_void_p),
    ('ai_canonname', c_char_p),
    ('ai_next', POINTER(c_addrinfo)),
]

c_addrinfo_p = POINTER(c_addrinfo)
result = c_addrinfo_p()
error = libc.getaddrinfo(
    c_char_p(b'www.google.com'),
    None,
    None,
    byref(result),
)

print(error)                          # 0
print(result.contents.ai_canonname)   # b'\x1c\x1e
print(result.contents.ai_addrlen)     # 28
print(bool(result.contents.ai_addr))  # False === null pointer

libc.freeaddrinfo(result)

Solution

  • According to the Linux man page for getaddrinfo, the addrinfo struct in which the results of getaddrinfo are stored is defined as

    struct addrinfo {
        int              ai_flags;
        int              ai_family;
        int              ai_socktype;
        int              ai_protocol;
        socklen_t        ai_addrlen;
        struct sockaddr *ai_addr;
        char            *ai_canonname;
        struct addrinfo *ai_next;
    };
    

    and according to the FreeBSD man page for getaddrinfo (or one of Apple's man pages for getaddrinfo, which is similar), its addrinfo looks the same, assuming all the types match up.

    struct addrinfo {
         int ai_flags;             /* input flags */
         int ai_family;            /* address family for socket */
         int ai_socktype;          /* socket type */
         int ai_protocol;          /* protocol for socket */
         socklen_t ai_addrlen;     /* length of socket-address */
         struct sockaddr *ai_addr; /* socket-address for socket */
         char *ai_canonname;       /* canonical name for service location */
         struct addrinfo *ai_next; /* pointer to next in list */
    };
    

    However, looking in the FreeBSD source (or one of the open source Apple projects, which is similar), we see a subtly different definition:

    struct addrinfo {
        int ai_flags;             /* AI_PASSIVE, AI_CANONNAME, AI_NUMERICHOST */
        int ai_family;            /* AF_xxx */
        int ai_socktype;          /* SOCK_xxx */
        int ai_protocol;          /* 0 or IPPROTO_xxx for IPv4 and IPv6 */
        socklen_t ai_addrlen;     /* length of ai_addr */
        char *ai_canonname;       /* canonical name for hostname */
        struct sockaddr *ai_addr; /* binary address */
        struct addrinfo *ai_next; /* next structure in linked list */
    };
    

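    It's very easy to miss, but ai_canonname and ai_addr are in the opposite order from the one documented. That accounts for the output in the question: the field read as ai_canonname actually held the ai_addr pointer, so it printed the start of the socket address (on Mac, 0x1c is the length byte, 28, and 0x1e is the address family, AF_INET6), while the field read as ai_addr held the real ai_canonname, which is null unless a canonical name is requested. This means that the Python ctypes definition for Mac (and similar platforms) should be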

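    from ctypes import c_uint32

    class c_addrinfo(Structure):
        pass
    
    c_addrinfo._fields_ = [
        ('ai_flags', c_int),
        ('ai_family', c_int),
        ('ai_socktype', c_int),
        ('ai_protocol', c_int),
        # socklen_t is a 32-bit unsigned integer on Mac and Linux, not size_t
        ('ai_addrlen', c_uint32),
        # On Mac, ai_canonname comes before ai_addr
        ('ai_canonname', c_char_p),
        ('ai_addr', c_void_p),
        ('ai_next', POINTER(c_addrinfo)),
    ]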
    

    or, for one that works on both Mac and Linux (other platforms are untested),

    import platform
    from ctypes import c_uint32
    
    c_addrinfo._fields_ = [
        ('ai_flags', c_int),
        ('ai_family', c_int),
        ('ai_socktype', c_int),
        ('ai_protocol', c_int),
        # socklen_t is a 32-bit unsigned integer on Mac and Linux, not size_t
        ('ai_addrlen', c_uint32),
    ] + ([
        # Mac (Darwin) header order
        ('ai_canonname', c_char_p),
        ('ai_addr', c_void_p),
    ] if platform.system() == 'Darwin' else [
        # Linux order, as documented in the man page
        ('ai_addr', c_void_p),
        ('ai_canonname', c_char_p),
    ]) + [
        ('ai_next', POINTER(c_addrinfo)),
    ]
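
    To see concretely why the field order matters, the byte offsets that ctypes assigns under each ordering can be compared. This is only a sketch: the helper make_addrinfo and the names documented and bsd_header are made up for illustration, and it assumes socklen_t is a 32-bit unsigned integer, as it is on Mac and Linux.

```python
from ctypes import POINTER, Structure, c_char_p, c_int, c_uint32, c_void_p, sizeof


def make_addrinfo(tail):
    """Build an addrinfo-like Structure whose two pointer fields follow `tail`."""
    class addrinfo(Structure):
        pass

    addrinfo._fields_ = [
        ('ai_flags', c_int),
        ('ai_family', c_int),
        ('ai_socktype', c_int),
        ('ai_protocol', c_int),
        ('ai_addrlen', c_uint32),  # assumption: socklen_t is 32 bits
    ] + tail + [
        ('ai_next', POINTER(addrinfo)),
    ]
    return addrinfo


# Order given in the man pages (and used by Linux)
documented = make_addrinfo([('ai_addr', c_void_p), ('ai_canonname', c_char_p)])
# Order found in the FreeBSD / Apple header
bsd_header = make_addrinfo([('ai_canonname', c_char_p), ('ai_addr', c_void_p)])

# Print the offset of each pointer field under both layouts
for name in ('ai_addr', 'ai_canonname'):
    print(name, getattr(documented, name).offset, getattr(bsd_header, name).offset)
print('sizes', sizeof(documented), sizeof(bsd_header))
```

    The two structures have the same size, and the two pointer fields simply trade offsets, so reading a Mac result through the documented layout yields no error, only the wrong pointer under each name.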
    

    With either of these versions, the ai_addr pointer is no longer null on Mac. You can also see an early/experimental version that parses the addresses themselves and works on both Mac and Linux.
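
    For reference, walking the result list and turning each entry into a printable address might look roughly like the sketch below. It is not the experimental version mentioned above. The function name resolve and the table _ADDR_FIELD are invented here, and the code assumes a Mac or Linux process where CDLL(None) exposes getaddrinfo, and that the raw address bytes sit at offset 4 in sockaddr_in and offset 8 in sockaddr_in6, which holds for both the BSD and Linux socket address layouts.

```python
import platform
import socket
from ctypes import (
    CDLL, POINTER, Structure, byref,
    c_char_p, c_int, c_uint32, c_void_p, string_at,
)

libc = CDLL(None)


class c_addrinfo(Structure):
    pass


c_addrinfo._fields_ = [
    ('ai_flags', c_int),
    ('ai_family', c_int),
    ('ai_socktype', c_int),
    ('ai_protocol', c_int),
    ('ai_addrlen', c_uint32),  # assumption: socklen_t is 32 bits
] + ([
    ('ai_canonname', c_char_p),
    ('ai_addr', c_void_p),
] if platform.system() == 'Darwin' else [
    ('ai_addr', c_void_p),
    ('ai_canonname', c_char_p),
]) + [
    ('ai_next', POINTER(c_addrinfo)),
]

# Declare the prototypes so ctypes converts and checks the arguments
libc.getaddrinfo.argtypes = [
    c_char_p, c_char_p, POINTER(c_addrinfo), POINTER(POINTER(c_addrinfo)),
]
libc.getaddrinfo.restype = c_int
libc.freeaddrinfo.argtypes = [POINTER(c_addrinfo)]
libc.freeaddrinfo.restype = None

# address family -> (offset of the raw address inside the sockaddr, its length)
_ADDR_FIELD = {socket.AF_INET: (4, 4), socket.AF_INET6: (8, 16)}


def resolve(host):
    """Return the textual addresses getaddrinfo reports for `host` (bytes)."""
    result = POINTER(c_addrinfo)()
    error = libc.getaddrinfo(host, None, None, byref(result))
    if error:
        raise OSError(error, 'getaddrinfo failed')
    addresses = []
    try:
        node = result
        while node:  # a null ctypes pointer is falsy
            info = node.contents
            field = _ADDR_FIELD.get(info.ai_family)
            if field and info.ai_addr:
                offset, size = field
                raw = string_at(info.ai_addr + offset, size)
                addresses.append(socket.inet_ntop(info.ai_family, raw))
            node = info.ai_next
    finally:
        libc.freeaddrinfo(result)
    return addresses


print(resolve(b'127.0.0.1'))
```

    Because no hints are passed, the same address can appear several times, once per socket type; wrapping the result in set() removes the duplicates. The address family is taken from ai_family rather than from the sockaddr itself, since the first bytes of sockaddr differ between Mac and Linux.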

    Edit: it looks like the documentation issue has already been reported to FreeBSD.