Search code examples
javageohashing

GeoHash Function not returning correct result


I am trying to write a geohashing function that takes a latitude/longitude pair and returns its base2 (geohash once converted to base32) string. However, this is giving me incorrect results. What is wrong with it?

public static void main(String[] args) {
    float latitude = 45.512794f;
    float longitude = -122.679565f;
    System.out.println(geoHash(latitude, longitude));
}

private static String geoHash(float lat, float lng) {
    float lowLat = -90.0f;
    float highLat = 90.0f;
    float lowLng = -180.0f;
    float highLng = 180.0f;
    return geoHash(lowLng, highLng, lowLat, highLat, lat, lng, "");
}

private static String geoHash(float lowLng, float highLng, float lowLat, float highLat, float lat, float lng, String hash) {
    if (hash.length() == 30)
        return hash;

    float midLat = (lowLat + highLat) / 2;
    float midLng = (lowLng + highLng) / 2;
    if (lng <= midLng && lat <= midLat) {
        return geoHash(lowLng, midLng, lowLat, midLat, lat, lng, hash + "00");
    } else if (lng > midLng && lat <= midLat) {
        return geoHash(midLng, highLng, lowLat, midLat, lat, lng, hash + "01");
    } else if (lng <= midLng && lat > midLat) {
        return geoHash(lowLng, midLng, midLat, highLat, lat, lng, hash + "10");
    } else {
        return geoHash(midLng, highLng, midLat, highLat, lat, lng, hash + "11");
    }
}

I am geting 101001000100000011011010100011 which converts to kh0dl3 base32, and I am expecting 11000001000000011110101110110 which converts to c20fbm.

what i don't understand is the first two pairs of bits are the same in the result that I get from my function (1010), meaning it hit the same quadrant twice. In the actual converted geohash I found online they are two different quadrants (1100).

Edit: Upon further debugging, and with the selected answer's help, I found that I was decoding the base 32 incorrectly (I was using 4 bits, instead of 5). I also should have used the table found at https://en.wikipedia.org/wiki/Geohash. There is also an error in my code, fixed below:

private static String geoHash(float lowLng, float highLng, float lowLat, float highLat, float lat, float lng, String hash) {
    if (hash.length() == 30)
        return hash;

    float midLng = (lowLng + highLng) / 2.0f;
    float midLat = (lowLat + highLat) / 2.0f;
    if (lng <= midLng && lat <= midLat) {
        return geoHash(lowLng, midLng, lowLat, midLat, lat, lng, hash + "00");
    } else if (lng <= midLng && lat > midLat) {
        return geoHash(lowLng, midLng, midLat, highLat, lat, lng, hash + "01");
    } else if (lng > midLng && lat <= midLat) {
        return geoHash(midLng, highLng, lowLat, midLat, lat, lng, hash + "10");
    } else {
        return geoHash(midLng, highLng, midLat, highLat, lat, lng, hash + "11");
    }
}

Solution

  • Wherever you got your expected String from, that source lied. First, your expected String is only 29 characters long, which means 1 character is missing. Also, the first two bits would need to be 01, because longitude is negative but latitude is positive.

    But there is still a bug in your code: If I understand the composition of a geohash correctly, you are switching the bits for longitude and latitude that you append to hash (the second and third if clause in the geoHash(float, float, float, float, float, float, String) method where you process lng and lat).

    Update

    Upon further investigation, it looks like another reason you are getting unexpected results is that, apparently, there is more than one possible conversion between base32 and base2. I tried out a few online decoders/encoders that I could find, and they all gave me the results you mentioned in your question. However, upon reading the Wikipedia page Geohash, it seems that the algorithm used for encoding geohashes from base2 to base32 is different.

    For example, let's examine the geohash you were actually getting (so there is no ambiguity about missing digits). Your method returns 101001000100000011011010100011, which you claim translates to kh0dl3. True, when I enter it here, I also get this result. But let's look a little closer. The first 5 characters are 10100, or, converted to decimal notation, 12 (5 characters from a base2 string correspond to one character in a base32 string, hence we need to take 5 characters at once). Entering 10100 into the page I just linked to yields K, the first character of kh0dl3, just as expected. However, according to the table in the Wikipedia Geohash page, 12 does not translate to k, but to d. So apparently, the base32-base2-conversion algorithm for geohashes is different than the one you used to get your expected result from.