r, travis-ci, github-api, httr

Authenticate at Github via Travis-CI using httr as well as locally (local works, remote doesn't)


I have an Rmd file that uses httr to access the Github-API. Locally, I can authenticate with Github just fine if I run the following in the R console before rendering the Rmd:

library(httr)
myapp <- oauth_app("APP", key = "xyz", secret = "pqr")
github_token <- oauth2.0_token(oauth_endpoints("github"), myapp)

The key and secret were created at Github, and exist in my workspace when I render, so github_token is picked up and I can access the Github-API without hitting the access limit when rendering locally.

Now, the same Rmd is also built automatically at Travis-CI and then deployed to gh-pages when I push the master branch. I have this working without authentication, but that caps my Github-API requests at 60/hr, and I need the higher limit one gets with authentication. So for this I have a personal access token (PAT) also set up in Github; the page where one sets the PAT says "Personal access tokens function like ordinary OAuth access tokens. They can be used instead of a password for Git over HTTPS, or can be used to authenticate to the API over Basic Authentication".

Here is part of my Rmd where I try to detect if the rendering is local or remote and get the appropriate token. However, when this is run at Travis-CI, the token doesn't appear to be recognized, so I don't think I'm using it correctly.

# Figure out the build location, and get the needed token
at_home <- FALSE
at_TCI <- FALSE
token_found <- FALSE
token_OK <- FALSE # not used now/yet

# Check to see if we are at TRAVIS-CI
# This next variable is in the Travis build environment & is a character string
token_value <- Sys.getenv("TRAVIS_CI") 
if (token_value != "") {
  token_found <- TRUE
  at_TCI <- TRUE
}

# Check to see if we are on the local/home machine
# This token is generated interactively via "Web Application Flow",
# and is deposited in the local workspace
# See developer.github.com/apps/building-oauth-apps/authorizing-oauth-apps/#web-application-flow
# This token has classes 'Token2.0', 'Token', 'R6' <Token2.0>
if (!at_TCI) {
  token_found <- exists("github_token")
  if (token_found) {
    token_value <- github_token
    at_home <- TRUE
  }
}

# See where we stand and act accordingly
if (!token_found) {
  message("Could not retrieve token - GET calls will be rate-limited by Github")
  # TEMPORARY: just use a few lines for faster testing & not blasting GH limits
  DF <- DF[1:5,]
}
if (token_found) {
  set_config(config(token = token_value)) # applies to all GET requests below
}

I don't think the set_config call is working correctly when I'm at Travis-CI, because I get an error that seems to come from a GET call that occurs later (it's really hard to troubleshoot on T-CI, but the Rmd works fine locally). Here is a sample GET call that fails remotely after running the snippet above, but works locally:

repoOK[i] <- identical(status_code(GET(DF$repo[i])), 200L)

where DF$repo[i] is a URL.

I'm new to httr and the Github-API. I've spent a lot of time experimenting with incantations found here on SO and with the Github documentation, but so far I've had no success with the remote build. Hence I call upon the mercies of the SO community!

EDIT: GH repo with full code.

EDIT 2: No one answered during the bounty period (!). So I will be working on the master branch. This branch has the code that works locally but fails at Travis-CI. Also, this branch has all Python stuff eliminated to avoid other issues and keep things clean. This branch gives the following error on Travis-CI:

Error in getGHdates(DF$repo[i], "commits") : Github access rate exceeded, try again later


Solution

  • The answer appears to be that one cannot use the same authentication method locally as is needed remotely at Travis-CI. To make the Rmd render properly in both locations, I had to write more complex code than I had hoped. In particular, for working locally it is sufficient to authenticate as follows.

    First, in the R console run (as above):

    myapp <- oauth_app("APP", key = "xyz", secret = "pqr")
    github_token <- oauth2.0_token(oauth_endpoints("github"), myapp)
    

    Then in the Rmd code one needs:

    # Figure out the build location, and get the needed token
    at_home <- FALSE
    at_TCI <- FALSE
    token_found <- FALSE
    where <- NULL
    
    # Check to see if we are at TRAVIS-CI
    # This token has class character
    token_value <- Sys.getenv("TRAVIS_CI")
    if (token_value != "") {
      token_found <- TRUE
      at_TCI <- TRUE
    }
    
    # Check to see if we are on the local/home machine
    # This token is generated interactively via "Web Application Flow",
    # and is deposited in the local workspace with the name github_token before rendering
    # See developer.github.com/apps/building-oauth-apps/authorizing-oauth-apps/#web-application-flow
    # This token has classes 'Token2.0', 'Token', 'R6' <Token2.0>
    if (!at_TCI) {
      token_found <- exists("github_token")
      if (token_found) {
        token_value <- github_token
        at_home <- TRUE
      }
    }
    
    # See where we stand and act accordingly
    if (!token_found) {
      message("Could not retrieve token - GET calls will be rate-limited by Github")
      # TEMPORARY: just use a few lines for faster testing & not blasting GH limits
      DF <- DF[1:5,]
    }
    if (token_found) {
      if (at_home) set_config(config(token = token_value))
      # This is sufficient for at_home and the GET calls elsewhere have a simple form
      if (at_home) where <- "home"
      if (at_TCI) where <- "TCI"
    }
    
    if (is.null(where)) stop("I'm lost")
    
    # Report for troubleshooting
    # cat("at_home = ", at_home, "\n")
    # cat("at_TCI = ", at_TCI, "\n")
    # cat("token_found = ", token_found, "\n")
    

    With this arrangement, calls to the Github API using GET work fine when rendering locally.
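Since failures are hard to diagnose on Travis-CI, one way to check whether a token is actually being honored is to ask GitHub for the current rate limit. The sketch below assumes GitHub's `/rate_limit` endpoint and the `resources$core$limit` field of its response; `core_limit` and `report_rate_limit` are hypothetical helper names, not part of httr.

```r
# Sketch: check whether requests are authenticated by inspecting the
# rate limit GitHub reports. Helper names here are hypothetical.

# Pull the core hourly limit out of the parsed JSON from /rate_limit.
core_limit <- function(parsed) {
  parsed$resources$core$limit
}

# Query the endpoint; extra httr config (e.g. headers) can be passed via `...`.
# Requires the httr package and network access.
report_rate_limit <- function(...) {
  resp <- httr::GET("https://api.github.com/rate_limit", ...)
  httr::stop_for_status(resp)
  limit <- core_limit(httr::content(resp, as = "parsed"))
  message("Core rate limit: ", limit, " requests/hr")
  invisible(limit)
}

# Offline illustration with a hand-built list shaped like the API response
fake <- list(resources = list(core = list(limit = 60L, remaining = 12L)))
core_limit(fake)
```

A reported limit of 60 means the request was treated as unauthenticated. Calling `report_rate_limit()` right after the `set_config` step, and writing the result to the build log, would show which case applies in each location.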

    However, when working remotely at Travis-CI, this approach does not work. For that case, one needs to do something along these lines:

    for (i in 1:ne) {
      if (!is.na(DF$web[i])) {
        if (at_home) access_string <- DF$web[i]
        if (at_TCI) {
          GH <- grepl("github\\.com", DF$web[i])
          if (!GH) access_string <- DF$web[i] # not a GitHub URL: no token needed
          if (GH) access_string <- paste0(DF$web[i], "?access_token=",
            token_value) # remote access from Travis-CI
        }
        webOK[i] <- identical(status_code(GET(access_string)), 200L)
        webLink[i] <- TRUE
        if (webLink[i] != webOK[i]) badWeb[i] <- TRUE
      }
    }
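The branching inside the loop can be pulled into a small function, which makes it easier to test without touching the network. This is only a sketch: `build_access_string` is a hypothetical name, and the handling of URLs that already carry a query string is an addition that is not in the loop above.

```r
# Hypothetical helper: decide which URL to hand to GET for one link.
# Mirrors the loop logic: only GitHub URLs fetched from Travis-CI get the token.
build_access_string <- function(url, token, at_TCI) {
  is_github <- grepl("github\\.com", url)
  if (at_TCI && is_github) {
    # Assumption beyond the original loop: use "&" if a query string exists
    sep <- if (grepl("?", url, fixed = TRUE)) "&" else "?"
    paste0(url, sep, "access_token=", token)
  } else {
    url
  }
}

# Example with a made-up token value
build_access_string("https://api.github.com/repos/user/repo", "abc123", at_TCI = TRUE)
```

Inside the loop this would replace the nested `if` statements with `access_string <- build_access_string(DF$web[i], token_value, at_TCI)`.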
    

    I found the advice to embed the token in the GET call here.
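Passing the token in the query string exposes it in URLs and logs, and GitHub has since deprecated that form of authentication. A possible alternative is to send the PAT in an `Authorization` header instead. The sketch below is untested against this project; `github_auth_header` and `get_with_pat` are hypothetical helpers, and it assumes the PAT is stored in the `TRAVIS_CI` environment variable as in the code above.

```r
# Hypothetical helper: build the Authorization header for a PAT.
github_auth_header <- function(token) {
  c(Authorization = paste("token", token))
}

# Hypothetical wrapper: GET a URL, sending the PAT only to GitHub hosts.
# Requires the httr package and network access.
get_with_pat <- function(url, token) {
  if (nzchar(token) && grepl("github\\.com", url)) {
    httr::GET(url, httr::add_headers(.headers = github_auth_header(token)))
  } else {
    httr::GET(url)
  }
}

# Possible use inside the loop on Travis-CI (not run here):
# resp <- get_with_pat(DF$web[i], Sys.getenv("TRAVIS_CI"))
# webOK[i] <- identical(httr::status_code(resp), 200L)
```

Keeping the header construction separate from the request makes it possible to check the header offline, and the token never appears in the URL itself.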

    If you've read this far, good luck on your own project! Full code is in this GH repo.