Tags: winapi, language-agnostic, winsock, endianness

gethostbyname and endianness - how are the bytes returned?


On my (Intel) x86 machine, I've noticed that if I printf the results of gethostbyname for localhost, I get 100007F, even though the MSDN documentation states it should return the IP in network byte order, aka big endian. I searched a bit and found this topic. Based on the answers there, I've deduced the sequence of bytes will be the same no matter the endianness, so, for localhost, I'd have this in memory on both Intel and AMD chips:

7F|00|00|01

Thus, reading that memory with an Intel chip results in a 'reversed' result, while on an AMD CPU, I'd get 0x7F000001. Is that assumption correct? It seems like the only possible explanation, but I want to make sure.

This is the code I'm using:

#define WIN32_LEAN_AND_MEAN

#include <winsock2.h>
#include <ws2tcpip.h>
#include <stdio.h>
#include <ctype.h>   // isalpha

// Need to link with Ws2_32.lib
#pragma comment(lib, "ws2_32.lib")

int main(int argc, char **argv)
{

    //-----------------------------------------
    // Declare and initialize variables
    WSADATA wsaData;
    int iResult;

    DWORD dwError;
    int i = 0;

    struct hostent *remoteHost;
    const char *host_name;   // points at a string literal, so keep it const
    struct in_addr addr;

    char **pAlias;

    // Initialize Winsock
    iResult = WSAStartup(MAKEWORD(2, 2), &wsaData);
    if (iResult != 0) {
        printf("WSAStartup failed: %d\n", iResult);
        return 1;
    }

    host_name = "localhost";


    // If the user input is an alpha name for the host, use gethostbyname()
    // If not, get host by addr (assume IPv4)
    if (isalpha((unsigned char)host_name[0])) {        /* host address is a name */
        printf("Calling gethostbyname with %s\n", host_name);
        remoteHost = gethostbyname(host_name);
    }
    else {
        printf("Calling gethostbyaddr with %s\n", host_name);
        addr.s_addr = inet_addr(host_name);
        if (addr.s_addr == INADDR_NONE) {
            printf("The IPv4 address entered must be a legal address\n");
            return 1;
        }
        else
            remoteHost = gethostbyaddr((char *)&addr, 4, AF_INET);
    }

    if (remoteHost == NULL) {
        dwError = WSAGetLastError();
        if (dwError != 0) {
            if (dwError == WSAHOST_NOT_FOUND) {
                printf("Host not found\n");
                return 1;
            }
            else if (dwError == WSANO_DATA) {
                printf("No data record found\n");
                return 1;
            }
            else {
                printf("Function failed with error: %lu\n", dwError);
                return 1;
            }
        }
    }
    else {
        printf("Function returned:\n");
        printf("\tOfficial name: %s\n", remoteHost->h_name);
        for (pAlias = remoteHost->h_aliases; *pAlias != 0; pAlias++) {
            printf("\tAlternate name #%d: %s\n", ++i, *pAlias);
        }
        printf("\tAddress type: ");
        switch (remoteHost->h_addrtype) {
        case AF_INET:
            printf("AF_INET\n");
            break;
        case AF_INET6:
            printf("AF_INET6\n");
            break;
        case AF_NETBIOS:
            printf("AF_NETBIOS\n");
            break;
        default:
            printf(" %d\n", remoteHost->h_addrtype);
            break;
        }
        printf("\tAddress length: %d\n", remoteHost->h_length);

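        if (remoteHost->h_addrtype == AF_INET) {
            i = 0;  // restart at the first address; i was used above to number aliases
            while (remoteHost->h_addr_list[i] != 0) {
                addr.s_addr = *(u_long *)remoteHost->h_addr_list[i++];
                // %lX matches the unsigned long type of s_addr
                printf("\tIPv4 Address #%d: %lX %s\n", i, addr.s_addr, inet_ntoa(addr));
            }
        }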
        else if (remoteHost->h_addrtype == AF_INET6)
            printf("\tRemotehost is an IPv6 address\n");
    }
    getchar();
    WSACleanup();   // release Winsock resources acquired by WSAStartup
    return 0;
}

The output (originally posted as a screenshot, not preserved here) shows the address as 100007F.


NOTE: I've had a friend run this on his AMD CPU, and surprisingly, apparently it's 100007F for him as well. Is my previous assumption wrong, or is my friend drunk?


Solution

  • The addresses contained in the hostent structure are in network byte order.

    If you have code that suggests otherwise then you are misinterpreting that code and reaching the wrong conclusion.

    On a little-endian host, 127.0.0.1 in network byte order reads as 0x0100007f when treated as a 32-bit integer. To see why, remember that a little-endian host stores the least significant byte first. Here that byte is 0x7f. So the bytes appear in memory in the order 0x7f, 0x00, 0x00, 0x01, which represents 127.0.0.1.

    Those same bytes on a big-endian host would represent a different 32-bit value. On a big-endian host the first byte is the most significant, so 0x7f, 0x00, 0x00, 0x01 would represent the value 0x7f000001. Note that AMD's x86 and x64 processors are little-endian, just like Intel's, so both machines print 100007F.