Tags: r, api, for-loop, http-post, httr

Running a POST request to get a Service Ticket inside a for loop


I'm working with the NIH/NLM REST API and trying to pull a lot of data programmatically. I've never worked with an API that authenticates with tickets (a TGT and Service Tickets, or STs) instead of OAuth, where a fresh Service Ticket is needed for every GET request, so I'm not sure I'm even going about this the right way. Any help is much appreciated.

Here's the code I currently have:

library(httr)
library(jsonlite)
library(xml2)

UTS_API_KEY <- 'MY API KEY'

# post to the CAS endpoint
response <- POST('https://utslogin.nlm.nih.gov/cas/v1/api-key', encode='form', body=list(apikey = UTS_API_KEY))

# print out the status_code and content_type
status_code(response)
headers(response)$`content-type`

doc <- content(response)
action_uri <- xml_text(xml_find_first(doc, '//form/@action'))
action_uri

# Service Ticket
response <- POST(action_uri, encode='form', body=list(service = 'http://umlsks.nlm.nih.gov'))
ticket <- content(response, 'text')
ticket #this is the ST I need for every GET request I make


# build search_uri using the paste function for string concatenation
version <- 'current'
search_uri <- paste('https://uts-ws.nlm.nih.gov/rest/search/', version, sep='')

# pass the query params into httr GET to get the response
query_string <- 'diabetic foot'
response <- GET(search_uri, query=list(ticket=ticket, string=query_string))

## print out some of the results
search_uri
status_code(response)
headers(response)$`content-type`


search_results_auto_parsed <- content(response)
search_results_auto_parsed


class(search_results_auto_parsed$result$results)

search_results_data_frame <- fromJSON(content(response,'text'))
search_results_data_frame

This code works perfectly for a handful of GET requests. However, I'm trying to pull 300-something medical terms. For example, instead of a single `query_string`, I'd like to loop through an array of strings (e.g., "diabetes", "blood pressure", "cardiovascular care", "EMT", etc.). I'd need to make the POST request and pass the ST into the GET parameters for every string in the array.

I've played around with this code:

for (i in seq_along(Entity_Subset$Entities)) {
  ent <- Entity_Subset$Entities[i] # Entities is the column of query strings in my data frame
  # no leading space in the URL; the ticket value still has to be appended
  url <- paste('https://uts-ws.nlm.nih.gov/rest/search/current?string=',
               ent, '&ticket=', sep = "")
  print(url)
}

But I haven't had much luck piecing together the POST and GET requests after putting the strings into the GET request URL.
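
Part of the difficulty with building the URL by hand is that terms containing spaces (such as "blood pressure") must be percent-encoded, which `httr` does automatically when values are passed through `query =`. A minimal base-R sketch of the manual equivalent follows; `terms`, `ticket`, and `build_search_url` are hypothetical stand-ins (for `Entity_Subset$Entities`, a real Service Ticket, and a helper of your own), not part of the API:

```r
# Hypothetical stand-ins for Entity_Subset$Entities and a real Service Ticket
terms  <- c("diabetes", "blood pressure", "EMT")
ticket <- "ST-placeholder"

# Build one search URL, percent-encoding the term and the ticket
build_search_url <- function(term, ticket, version = "current") {
  paste0(
    "https://uts-ws.nlm.nih.gov/rest/search/", version,
    "?string=", utils::URLencode(term, reserved = TRUE),
    "&ticket=", utils::URLencode(ticket, reserved = TRUE)
  )
}

urls <- vapply(terms, build_search_url, character(1),
               ticket = ticket, USE.NAMES = FALSE)
print(urls)
```

Because a Service Ticket is single-use, each URL would in practice need its own freshly requested ticket rather than one shared placeholder.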

Sidebar: I also tried writing some pre-request scripts in Postman, but oddly the Service Ticket isn't returned as JSON (there's no key-value pair to grab and pass), just plain text.

Thank you for any advice you can provide!


Solution

  • I think you can simply wrap both the POST and GET requests in a function, then `lapply` that function over a character vector of query strings.

    library(httr)
    library(jsonlite)
    library(xml2)
    
    fetch_data <- function(query_string = 'diabetic foot', UTS_API_KEY = 'MY API KEY', version = 'current') {
      response <- POST('https://utslogin.nlm.nih.gov/cas/v1/api-key', encode='form', body=list(apikey = UTS_API_KEY))
      
      # print out the status_code and content_type
      message(status_code(response), "\n", headers(response)$`content-type`)
      action_uri <- xml_text(xml_find_first(content(response), '//form/@action')); message(action_uri)
      
      # Service Ticket
      response <- POST(action_uri, encode = 'form', body=list(service = 'http://umlsks.nlm.nih.gov'))
      ticket <- content(response, 'text'); message(ticket)
      
      # build search_uri using the paste function for string concatenation
      search_uri <- paste0('https://uts-ws.nlm.nih.gov/rest/search/', version)
      # pass the query params into httr GET to get the response
      response <- GET(search_uri, query=list(ticket=ticket, string=query_string))
      ## print out some of the results
      message(search_uri, "\n", status_code(response), "\n", headers(response)$`content-type`)
      fromJSON(content(response, 'text'))
    }
    
    # if you have a vector of query strings, then
    result <- lapply(Entity_Subset$Entities, fetch_data, UTS_API_KEY = "blah blah blah")
    
    # The `lapply` above is logically equivalent to
    result <- vector("list", length(Entity_Subset$Entities))
    for (i in seq_along(Entity_Subset$Entities)) {
      # index by position so each result fills its preallocated slot
      result[[i]] <- fetch_data(Entity_Subset$Entities[i], "blah blah blah")
    }
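
With 300-something terms, a single failed request would abort the whole `lapply` call and lose the results collected so far. Below is a minimal sketch of one way to guard against that by wrapping each call in `tryCatch`. Note that `fake_fetch` and `safe_fetch` are hypothetical names: `fake_fetch` is a stand-in for `fetch_data` that simulates one failure so the example runs without network access.

```r
# Hypothetical stand-in for fetch_data(): fails for one term to mimic a bad request
fake_fetch <- function(query_string) {
  if (query_string == "EMT") stop("simulated HTTP failure")
  list(term = query_string, n_chars = nchar(query_string))
}

# Return NULL (and log a message) instead of stopping when a single term fails
safe_fetch <- function(term, fetch = fake_fetch) {
  tryCatch(
    fetch(term),
    error = function(e) {
      message("Request failed for '", term, "': ", conditionMessage(e))
      NULL
    }
  )
}

terms <- c("diabetes", "blood pressure", "EMT")
results <- lapply(terms, safe_fetch)
names(results) <- terms  # label each result with its query string

# Terms whose request failed and may need a retry
failed <- names(results)[vapply(results, is.null, logical(1))]
print(failed)
```

In real use you would pass `fetch = fetch_data` (or a small wrapper that supplies your API key), and you might add a short `Sys.sleep()` between calls to go easy on the service.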