Tags: java, google-app-engine, cursor, google-cloud-datastore, jdo

GAE Datastore - How to always get newest/latest results and continue where you left off?


I'm using Google App Engine (GAE) with the Datastore (via JDO), in Java.

I want the following to happen:

Let's say I have the following model: Card. It has only one field, the time of creation (I call it 'toc').

I want to always retrieve the 3 newest Card entities when a GET request is made. Let's say I currently have 5 Card entities in the Datastore called C1, C2 ... C5. They were inserted in that order, meaning C5 is the newest entity.

I now do my GET request and thus get the entities C5, C4 and C3. After I have used the entities on the client side, let's say 5 minutes later, I want to retrieve the next 3 newest entities. But during those 5 minutes 2 new Cards have been added to the Datastore, which now looks as follows: C1, C2, C3, C4, C5, C6, C7.

I now want to retrieve C7, C6 and the newest Card that hasn't already been retrieved, which is C2. In other words, I want C7, C6 and C2.

I've been experimenting with cursors and can't seem to get this behavior. Currently I'm using a query that sorts the Cards ascending, meaning I get the oldest Cards first, and once I've gone through them all, the cursor can detect when new entries have been added and retrieve them. But if more than 3 have been added since the last query, it will still take the first 3 (instead of the last 3), meaning I get the 3 oldest of the "new" Cards. Is there any way to get an "end-cursor" without actually getting to the end? If there is, I might be able to live with getting the 3 oldest of the new Cards.

Real-life application: Imagine 9GAG's Fresh category, where the newest entries are always on top (easy). BUT in my application the previous ones will most likely never be seen again, meaning I want to avoid repetition. So it should prioritize the newest entries, and when there are no "new" Cards any more it should continue where I last left off.

Sorry for wall of text! And thanks for any feedback! :)


Solution

  • Thanks @Bharath for getting me to think about this in other ways. My solution is inspired by yours. However, your solution was a bit too simplistic, since with it I could only retrieve the absolute newest entries, while I actually wanted old ones if there were no new ones, and I wanted to be able to return to the previous position.

    My solution works as follows:

    Say I've got the entries C5, C4, C3, C2, C1. They are sorted descending (DESC) by timestamp, meaning the newest entry comes first.

    As in the example in OP, I want to retrieve 3 Cards. I then retrieve C5, C4 and C3 and save the timestamp of C5 as "newestTime" and save the timestamp of C3 as "oldestTime" on the client. With these two I can simulate the cursor behavior in the Datastore.

    Next time I retrieve Cards, I first try to get Cards with a timestamp greater than "newestTime", which means they are newer. If I get fewer than 3 results, I try to get the remaining number of Cards by fetching the Cards that have a timestamp lower than "oldestTime", which means they are Cards that I haven't gotten yet. So "oldestTime" works like a cursor pointing at the last entry that was retrieved. When the "newestTime" and "oldestTime" queries together return fewer than 3 results, there are no more "unseen" results, so we have to restart: we retrieve the remaining Cards from the beginning (and force-update "oldestTime"). This results in the exact behavior I described in the OP.

    Some thoughts: My guess is that this solution is less efficient than using cursors, since I always have to filter by timestamps. Is that a correct assumption, or is that what cursors do internally anyway?
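
The two-timestamp approach described in the solution can be sketched roughly as below. This is only an in-memory illustration, not the JDO code from the answer: the class name `CardFeed`, its `add` and `fetch` methods, and the use of plain `long` values in place of Card entities are all assumptions made for the example, and it assumes every Card has a unique timestamp. In a real application each of the three steps would be a Datastore query filtered and ordered on 'toc'.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory stand-in for the Datastore; a Card is reduced to its 'toc' timestamp. */
class CardFeed {
    private final List<Long> store = new ArrayList<>(); // all Card timestamps
    private long newestTime = Long.MIN_VALUE; // newest toc retrieved so far
    private long oldestTime = Long.MAX_VALUE; // pseudo-cursor: oldest toc retrieved so far

    void add(long toc) {
        store.add(toc);
    }

    /** Returns up to n timestamps: unseen newer Cards first, then older ones, then a wrap-around. */
    List<Long> fetch(int n) {
        List<Long> sorted = new ArrayList<>(store);
        sorted.sort(Collections.reverseOrder()); // newest first (descending)
        List<Long> result = new ArrayList<>();

        // 1) Cards newer than anything retrieved so far.
        for (long toc : sorted) {
            if (result.size() == n) break;
            if (toc > newestTime) result.add(toc);
        }
        if (!result.isEmpty()) {
            newestTime = result.get(0);
            // Only moves the pseudo-cursor on the very first fetch.
            oldestTime = Math.min(oldestTime, result.get(result.size() - 1));
        }

        // 2) Fill up with Cards older than the pseudo-cursor.
        for (long toc : sorted) {
            if (result.size() == n) break;
            if (toc < oldestTime) {
                result.add(toc);
                oldestTime = toc;
            }
        }

        // 3) Nothing unseen is left: restart from the newest Card and reset the pseudo-cursor.
        if (result.size() < n) {
            for (long toc : sorted) {
                if (result.size() == n) break;
                if (!result.contains(toc)) {
                    result.add(toc);
                    oldestTime = toc;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CardFeed feed = new CardFeed();
        for (long t = 1; t <= 5; t++) feed.add(t); // C1..C5
        System.out.println(feed.fetch(3)); // [5, 4, 3]
        feed.add(6);
        feed.add(7);                       // C6 and C7 arrive
        System.out.println(feed.fetch(3)); // [7, 6, 2]
        System.out.println(feed.fetch(3)); // [1, 7, 6] (wrap-around)
    }
}
```

One consequence of this sketch worth noting: if more than n new Cards arrive between two fetches, step 1 returns only the n newest, and the ones in between are not revisited until the wrap-around passes over them.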