How to load data from the GitHub GraphQL API using since, like the REST API


I have written a pipeline that loads issues from GitHub into BigQuery, and I want to make it incremental, that is, load only the data created between the last run and the current run. I tweaked the pipeline code to pass a since argument, but I don't know whether GraphQL supports it. Is there a different way to load data incrementally using GraphQL? Presumably we would have to pass a variable to the query to retrieve only the data needed.

Here is how the query starts:

ISSUES_QUERY = """
query($owner: String!, $name: String!, $issues_per_page: Int!, $first_reactions: Int!, $first_comments: Int!, $page_after: String) {
  repository(owner: $owner, name: $name) {
    %s(first: $issues_per_page, orderBy: {field: CREATED_AT, direction: DESC}, after: $page_after) {
      totalCount
      pageInfo {
        endCursor
        startCursor
      }
      nodes {
        id

If anyone can suggest how to pass a since parameter, or tell me whether GraphQL supports one at all, that would be great. Thanks.

Here is the modified query I tried:

ISSUES_QUERY = """
query($owner: String!, $name: String!, $issues_per_page: Int!, $first_reactions: Int!, $first_comments: Int!, $page_after: String, $since: String) {
  repository(owner: $owner, name: $name) {
    %s(first: $issues_per_page, orderBy: {field: CREATED_AT, direction: DESC}, after: $page_after, since: $since) {
      totalCount
      pageInfo {
        endCursor
        startCursor
      }
      nodes {
        id
        number

I got this error:

In processing pipe issues: extraction of resource issues in generator _get_reactions_data caused an exception: {'errors': [{'path': ['query', 'repository', 'issues', 'since'], 'extensions': {'code': 'argumentNotAccepted', 'name': 'issues', 'typeName': 'Field', 'argumentName': 'since'}, 'locations': [{'line': 4, 'column': 104}], 'message': "Field 'issues' doesn't accept argument 'since'"}, {'path': ['query'], 'extensions': {'code': 'variableNotUsed', 'variableName': 'since'}, 'locations': [{'line': 2, 'column': 1}], 'message': 'Variable $since is declared by anonymous query but not used'}]}


Solution

  • I suggest you page backwards in time until you reach an issue whose createdAt is earlier than your since cutoff. The issues field (an IssueConnection) does not accept a since argument, which is what the error is telling you.

    Please note also that issues can be updated at any time, so this approach won't catch updates to issues that were created before the cutoff.
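
    The paging approach above could be sketched roughly as follows. This is a minimal sketch, not the pipeline's actual code: fetch_page is a hypothetical callable that runs ISSUES_QUERY with the given page_after cursor and returns the issues connection as a dict. It assumes the query also selects createdAt on each node and hasNextPage in pageInfo (neither is visible in the excerpt in the question), and that results are ordered by CREATED_AT descending as in the question.

```python
from datetime import datetime, timezone
from typing import Callable, Iterator, Optional


def parse_ts(value: str) -> datetime:
    """Parse a UTC timestamp of the form '2023-01-05T10:00:00Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def issues_since(
    fetch_page: Callable[[Optional[str]], dict],  # hypothetical: cursor -> issues connection
    since: datetime,
    ts_field: str = "createdAt",  # assumed to be selected in the query's nodes
) -> Iterator[dict]:
    """Yield issues newer than `since`, assuming pages arrive newest first."""
    cursor = None  # None requests the first page
    while True:
        page = fetch_page(cursor)
        for node in page["nodes"]:
            if parse_ts(node[ts_field]) <= since:
                # Results are sorted newest first, so everything from here on
                # is at or before the cutoff: stop paging.
                return
            yield node
        info = page["pageInfo"]
        if not info["hasNextPage"]:  # assumed to be selected in pageInfo
            return
        cursor = info["endCursor"]
```

    On each run you would pass the timestamp of the previous run as since and store the new one afterwards. The loop would only pick up updated issues if the query were ordered by UPDATED_AT and ts_field were set to updatedAt; that is a change to the query you would have to make yourself, not something the pipeline in the question does.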