rest, collections, pagination, hateoas

REST pagination content duplicates


Suppose I am building a REST application that returns a collection of items (for example, a topic with a collection of posts), sorted from newest to oldest.

If HATEOAS principles are followed and the content is split into pages, the client gets the current page ID, the offset, the page size limit, and links to, for example, the first, current, and next pages.

Fetching the next page is not a problem in itself. But if somebody adds content while the client is reading the current page, the new items are pushed onto the start of the collection, and the last item of the current page moves to the next page.

If you simply skip the posts that have already been loaded, you get fewer items on the next page. Another option is to count the items that were added at the start of the list and increase the offset by that amount.
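To make the problem concrete, here is a minimal sketch in Python. It assumes an in-memory list of post IDs sorted newest first; `get_page` and `get_page_shifted` are hypothetical helpers, not part of any framework. It shows the duplicates that appear with plain offsets and the offset correction mentioned above, which only works if items are added at the start of the list and never removed.

```python
def get_page(posts, offset, limit):
    """Plain offset pagination over a list sorted newest first."""
    return posts[offset:offset + limit]


def get_page_shifted(posts, offset, limit, total_at_first_read):
    """Offset pagination corrected by the number of items added since the
    client's first read. Assumes items are only prepended, never removed."""
    added = len(posts) - total_at_first_read
    start = offset + added
    return posts[start:start + limit]


# Hypothetical data: post IDs, newest first.
posts = [105, 104, 103, 102, 101, 100]
total_at_first_read = len(posts)
page1 = get_page(posts, offset=0, limit=3)

# Two new posts arrive while the client is still reading page 1.
posts = [107, 106] + posts

# The plain offset repeats 104 and 103, which the client already has.
naive_page2 = get_page(posts, offset=3, limit=3)

# Shifting the offset by the two added posts continues where page 1 ended.
fixed_page2 = get_page_shifted(posts, 3, 3, total_at_first_read)

print(page1, naive_page2, fixed_page2)
```

In a real API the server would need some way to learn `total_at_first_read`, for example by embedding it in the next link it hands to the client; that detail is an assumption of this sketch.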

What is the best practice for this?


Solution

  • Instead of offset indexes, use skip tokens that indicate the first value not to include (or the first value to include). This is a good technique provided the value is unique for every item in your result set and is an orderable field under the current sort. It's not flawless, but usually that just doesn't matter.

    If it really does matter, you have to pass the IDs of everything on the first page in the call for the 2nd page, and keep doing that for every later page. HATEOAS helps you do stuff like this, but it can get very messy, and things can still pop up on page 1 under the current sorting while you are requesting page 5. What do you do with that?

    Another trick to avoid dupes in a UI is to use the self or canonical link relations to uniquely identify the resources in a page and compare those to the resources already in the UI. Updating the UI with the latest matching resources is usually a simple task. This of course puts some burden on the client.

    There is no one-size-fits-all solution to this problem. You have to design for the UX you intend to fulfill.
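The skip-token idea from the answer can be sketched roughly as follows. This is a minimal illustration in Python, assuming post IDs are unique and grow with creation time, so they can serve as the orderable field; the function `get_page_after`, the `after` parameter, and the URL layout are all hypothetical.

```python
def get_page_after(post_ids, limit, after=None):
    """Return up to `limit` IDs older than the skip token `after`.

    `post_ids` is sorted newest first. The token is the last ID the client
    has already received, i.e. the first value NOT to include.
    """
    older = [i for i in post_ids if after is None or i < after]
    items = older[:limit]
    links = {}
    if len(older) > limit:
        # Hypothetical URL layout; the token travels in the next link.
        links["next"] = f"/topics/1/posts?after={items[-1]}&limit={limit}"
    return {"items": items, "links": links}


# Hypothetical data: post IDs, newest first.
post_ids = [105, 104, 103, 102, 101, 100]
page1 = get_page_after(post_ids, limit=3)

# Two new posts arrive while the client is still reading page 1.
post_ids = [107, 106] + post_ids

# The client follows the next link from page 1 (after=103).
page2 = get_page_after(post_ids, limit=3, after=103)

print(page1)
print(page2)
```

Because the second request is anchored to a value rather than a position, the new posts do not shift the page boundary. The client still does not see posts 107 and 106 until it reloads from the start, which is the sense in which the technique is not flawless.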