I am creating an appEngine application in python that will need to perform efficient geospatial queries on datastore data. An example use case would be, I need to find the first 20 posts within a 10 mile radius of the current user. Having done some research into my options, I have found that currently what seems like the 2 best approaches for achieving this type of functionality would be:
It seems from a high level perspective that indexing geohashes and performing queries on them directly would be less costly and much faster than having to create and delete a document for every geospatial query, however i've also read that geohashing can be very inaccurate along the equator or along 'faultlines' created by the hashing algorithm. I've seen very few posts contrasting the best methods in detail, and I think stack is a good place to have this conversation, so my questions are as follows:
Thanks in advance.
Geohashing does not have to be inaccurate at all. It's all in the implementation details. What I mean is you can check the neighbouring geocells as well to handle border-cases, and make sure that includes neighbours on the other side of the equator.
If your use case is finding other entities within a radius as you suggest, I would definitely recommend using the Search API. They have a distance function tailored for that use.
Search API queries are more expensive than Datastore queries yes, but if you weigh in the computation time to do these calculations in your instance and probably iterating through all entities for each geohash to make sure the distance is actually less than the desired radius, then I would say Search API is the winner. And don't forget about the implementation time.