
Relationship between GraphQL and database when using connection and pagination?


It is very easy to set up pagination with Relay; however, there's a small detail that is unclear to me.

Both of the relevant parts in my code are marked with comments; the other code is for additional context.

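const { GraphQLObjectType, GraphQLString } = require('graphql')
const {
  globalIdField,
  connectionArgs,
  connectionFromArray,
  connectionDefinitions,
} = require('graphql-relay')
// nodeInterface and getUserPosts are defined elsewhere in the project

const postType = new GraphQLObjectType({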
  name: 'Post',
  fields: () => ({
      id: globalIdField('Post'),
      title: {
        type: GraphQLString
      },
  }),
  interfaces: [nodeInterface],
})

const userType = new GraphQLObjectType({
  name: 'User',
  fields: () => ({
      id: globalIdField('User'),
      email: {
        type: GraphQLString
      },
      posts: {
        type: postConnection,
        args: connectionArgs,
        resolve: async (user, args) => {
          // getUserPosts() is in next code block -> it gets the data from db
          // I pass args (e.g "first", "after" etc) and user id (to get only user posts)
          const posts = await getUserPosts(args, user._id)
          return connectionFromArray(posts, args)
        }
      },
  }),
  interfaces: [nodeInterface],
})

const {connectionType: postConnection} =
  connectionDefinitions({name: 'Post', nodeType: postType})

exports.getUserPosts = async (args, userId) => {
    try {
      // using MongoDB and Mongoose but question is relevant with every db
      // .limit() -> how many posts to return
      const posts = await Post.find({author: userId}).limit(args.first).exec()
      return posts
    } catch (err) {
      // rethrow so the resolver fails instead of receiving an Error object in place of an array
      throw err
    }
}

Cause of my confusion:

  • If I pass the first argument and use it in the db query to limit the returned results, hasNextPage is always false. This is efficient, but it breaks hasNextPage (or hasPreviousPage if you use last).
  • If I don't pass the first argument and don't use it in the db query to limit the returned results, hasNextPage works as expected, but the query returns every item that matches (could be thousands).
    • Even if the database is on the same machine (which isn't the case for bigger apps), this seems very, very inefficient. Please prove me wrong!
    • As far as I know, GraphQL doesn't have any server-side caching, so there wouldn't be any point in returning all the results (even if it did, users don't browse 100% of the content).

What's the logic here?

One solution that comes to mind is to add 1 to the first value in getUserPosts. It would retrieve one excess item, and hasNextPage would probably work. But this feels like a hack, and there's always an excess item returned. That overhead would grow relatively quickly if there are many connections and requests.

Are we expected to hack it like that? Is it expected to return all the results?

Or did I misunderstand the whole relationship between the database and GraphQL / Relay?


What if I used FB DataLoader and Redis? Would that change anything about that logic?


Solution

  • Cause of my confusion

    The utility function connectionFromArray of the graphql-relay-js library is NOT the solution to all kinds of pagination needs. We need to adapt our approach to the pagination model we prefer.

    connectionFromArray derives the values of hasNextPage and hasPreviousPage from the array it is given. It treats that array as the complete list, so it can only report a next page when the array holds items beyond the requested slice. So, what you observed and mentioned in "Cause of my confusion" is the expected behavior.
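
To make that concrete, here is a minimal sketch of the forward-pagination case. It is a simplified model, not the library source, and pageInfoFromArray is a hypothetical name used only for illustration:

```javascript
// Simplified model of how an array-based helper derives hasNextPage.
// NOT the graphql-relay-js source; it only mirrors the "first" case.
function pageInfoFromArray(items, first) {
  const end = Math.min(first, items.length)
  return {
    edges: items.slice(0, end),
    // true only when the array holds items beyond the requested slice
    hasNextPage: end < items.length,
  }
}

const allPosts = ['p1', 'p2', 'p3', 'p4', 'p5']

// Whole list passed in: the helper can see that more items exist.
const fromFullList = pageInfoFromArray(allPosts, 2)

// List already cut down by .limit(2): nothing is left past the slice,
// so the helper has no way to know more posts exist in the database.
const fromLimitedList = pageInfoFromArray(allPosts.slice(0, 2), 2)

console.log(fromFullList.hasNextPage, fromLimitedList.hasNextPage)
```

The second call is the situation in the question: the array was already limited in the db query, so nothing in it tells the helper that another page exists.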

    As for whether to load all the data or not, it depends on the problem at hand. Loading all items may make sense in situations such as:

    • the number of items is small and you can afford the memory required to store those items.
    • the items are frequently requested and you need to cache them for faster access.

    Two common pagination models are numbered pages and infinite scrolling. The GraphQL connection specification is not opinionated about the pagination model and allows both of them.

    For numbered pages, you can add an extra field totalPost to your GraphQL type, which can be used to display links to numbered pages in your UI. On the back-end, you can use a feature such as skip to fetch only the needed items. Together, the totalPost field and the current page number eliminate the dependency on hasNextPage or hasPreviousPage.
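
A rough sketch of the numbered-pages idea, using an in-memory array as a stand-in for the database. The names getPostPage, pageSize and totalPages are assumptions made for this example; with Mongoose the same shape would come from a count query plus .skip() and .limit():

```javascript
// Hypothetical in-memory stand-in for the posts collection (23 posts).
const postTable = Array.from({ length: 23 }, (_, i) => ({
  id: i + 1,
  title: `Post ${i + 1}`,
}))

// Returns one numbered page plus the total count (the "totalPost" field).
function getPostPage(page, pageSize) {
  const skip = (page - 1) * pageSize // pages are 1-based
  const items = postTable.slice(skip, skip + pageSize)
  const totalPost = postTable.length
  return { items, totalPost, totalPages: Math.ceil(totalPost / pageSize) }
}

const thirdPage = getPostPage(3, 10)
console.log(thirdPage.items.map((p) => p.id), thirdPage.totalPages)
```

The UI can render page links from totalPages alone, so only the rows for the requested page are ever fetched.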

    For infinite scrolling, you can use the cursor field, whose value can be passed as after in your next query. On the back-end, you can use the value of cursor to retrieve the next items (as many as the value of first). See an example of using cursor in the Relay documentation on GraphQL connection. See this answer about GraphQL connection and cursor. See this and this blog post, which will help you better understand the idea of cursor.
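
A minimal sketch of cursor-based fetching, assuming posts are sorted by an ascending numeric id. The cursor format ("post:<id>", base64-encoded) and the helper names are made up for this example; the spec only asks that cursors be opaque strings:

```javascript
// Hypothetical opaque cursor format: base64 of "post:<id>".
const encodeCursor = (id) => Buffer.from(`post:${id}`).toString('base64')
const decodeCursor = (cursor) =>
  Number(Buffer.from(cursor, 'base64').toString('utf8').slice('post:'.length))

// In-memory stand-in for a collection sorted by ascending id.
const sortedPosts = [1, 2, 3, 4, 5].map((id) => ({ id, title: `Post ${id}` }))

// Fetch `first` posts that come after the cursor (or from the start).
// With a real database this would be a "id > afterId" filter plus a limit.
function getPostsAfter(after, first) {
  const afterId = after ? decodeCursor(after) : 0
  return sortedPosts
    .filter((post) => post.id > afterId)
    .slice(0, first)
    .map((node) => ({ cursor: encodeCursor(node.id), node }))
}

const firstPage = getPostsAfter(null, 2)
// The client sends back the last cursor it received as "after".
const secondPage = getPostsAfter(firstPage[firstPage.length - 1].cursor, 2)
console.log(secondPage.map((edge) => edge.node.id))
```

Because the cursor encodes a position rather than a page number, the query stays correct even when new posts are inserted while the user scrolls.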


    What's the logic here?

    Are we expected to hack it like that?

    No, ideally we're not expected to hack it and forget about it. That leaves technical debt in the project, which is likely to cause more problems in the long term. You may consider implementing your own function to return a connection object. You can get ideas for how to do that from the implementation of array-connection in graphql-relay-js.
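
As a sketch of what such a function could look like: the version below builds a connection from a slice that the database has already limited, plus two numbers the caller supplies (the offset of the slice and the total count, e.g. from a count query). connectionFromSlice and the offset-based cursor format are assumptions for illustration. graphql-relay-js also exports connectionFromArraySlice, which takes the slice together with sliceStart and arrayLength and may cover this case without custom code.

```javascript
// Hypothetical offset-based cursor: base64 of "offset:<n>".
const offsetToCursor = (offset) =>
  Buffer.from(`offset:${offset}`).toString('base64')

// slice:      rows fetched for this page only
// sliceStart: position of slice[0] in the full result set
// totalCount: size of the full result set (e.g. from a count query)
function connectionFromSlice(slice, { sliceStart, totalCount }) {
  const edges = slice.map((node, i) => ({
    cursor: offsetToCursor(sliceStart + i),
    node,
  }))
  const sliceEnd = sliceStart + slice.length
  return {
    edges,
    pageInfo: {
      startCursor: edges.length ? edges[0].cursor : null,
      endCursor: edges.length ? edges[edges.length - 1].cursor : null,
      hasPreviousPage: sliceStart > 0,
      // derived from the count, not from the size of the fetched slice
      hasNextPage: sliceEnd < totalCount,
    },
  }
}

const middlePage = connectionFromSlice([{ id: 11 }, { id: 12 }], {
  sliceStart: 10,
  totalCount: 50,
})
const lastPage = connectionFromSlice([{ id: 49 }, { id: 50 }], {
  sliceStart: 48,
  totalCount: 50,
})
console.log(middlePage.pageInfo.hasNextPage, lastPage.pageInfo.hasNextPage)
```

This keeps the limited db query and still reports hasNextPage correctly, at the cost of one extra count query per connection.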

    Is it expected to return all the results?

    Again, it depends on the problem.


    What if I used FB DataLoader and Redis? Would that change anything about that logic?

    You can use Facebook's DataLoader library to cache and batch your queries. Redis is another option for caching the results. If (1) you load all items using DataLoader or store all items in Redis and (2) the items are lightweight, you can easily create an array of all items (following the KISS principle). If the items are heavyweight, creating the array may be an expensive operation.