r api http-status-code-403 httr

How to get data from an API in R?


I am new to APIs and have come across a short piece of Python code that retrieves data, which I would like to replicate in R:

Python code:

import requests
import json
from datetime import date
import time
import smtplib, ssl

#API URL
url = 'http://cdn-api.co-vin.in/api/v2/admin/location/states'
headers = {'accept': 'application/json','Accept-Language' : 'hi_IN','User-Agent': 'Mozilla/4.0'}
result = requests.get(url, headers=headers)
#Print state ID
print(result.content.decode())

Results:

{"states":[{"state_id":1,"state_name":"Andaman and Nicobar Islands"},{"state_id":2,"state_name":"Andhra Pradesh"},{"state_id":3,"state_name":"Arunachal Pradesh"},{"state_id":4,"state_name":"Assam"},{"state_id":5,"state_name":"Bihar"},{"state_id":6,"state_name":"Chandigarh"},{"state_id":7,"state_name":"Chhattisgarh"},{"state_id":8,"state_name":"Dadra and Nagar Haveli"},{"state_id":37,"state_name":"Daman and Diu"},{"state_id":9,"state_name":"Delhi"},{"state_id":10,"state_name":"Goa"},{"state_id":11,"state_name":"Gujarat"},{"state_id":12,"state_name":"Haryana"},{"state_id":13,"state_name":"Himachal Pradesh"},{"state_id":14,"state_name":"Jammu and Bengal"}],"ttl":24}
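To work with this result rather than just print it, the body can be parsed with json.loads. The following is only a minimal sketch, assuming the response keeps the shape shown above; sample_body is a shortened copy of that output (not a live API call) and the helper name states_by_id is hypothetical.

```python
import json

# Shortened copy of the response body shown above; no request is made here.
sample_body = (
    '{"states":['
    '{"state_id":1,"state_name":"Andaman and Nicobar Islands"},'
    '{"state_id":2,"state_name":"Andhra Pradesh"}'
    '],"ttl":24}'
)


def states_by_id(body: str) -> dict:
    """Map state_id to state_name for a response shaped like the one above."""
    payload = json.loads(body)
    return {s["state_id"]: s["state_name"] for s in payload["states"]}


print(states_by_id(sample_body))
```

With the real request, the same helper could be given result.content.decode() in place of sample_body.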

Info about API:

url: 'http://cdn-api.co-vin.in/api/v2/admin/location/states'

from: https://apisetu.gov.in/public/marketplace/api/cowin#/Metadata%20APIs/states


from: https://github.com/cowinapi/developer.cowin/issues/339

(Note: the CoWIN API can only be accessed from India, so I guess many of you will not be able to use it. It would still be helpful if you could suggest some code changes.)

R

I have googled a bit and tried the pieces of code below, but none of them has worked so far:

library(tidyverse)
library(rjson)
library(jsonlite)
library(RCurl)
library(httr)
states_url = 'http://cdn-api.co-vin.in/api/v2/admin/location/states'

headers = c({'accept' = 'application/json'},
            {'Accept-Language' = 'hi_IN'},
            {'User-Agent' = 'Mozilla/4.0'})

url(states_url, headers = headers)
GET(states_url)$content
GET(states_url)$headers

Update

I have tried this and it didn't give an error, but I am not sure what to do next:

states_url = 'http://cdn-api.co-vin.in/api/v2/admin/location/states'

headers = c('accept' = {'application/json'},
            'Accept-Language' = {'hi_IN'},
            'User-Agent' = {'Mozilla/4.0'})

url(states_url, headers = headers)

A connection with
description "http://cdn-api.co-vin.in/api/v2/admin/location/states" class "url-wininet"
mode "r"
text "text"
opened "closed"
can read "yes"
can write "no"

GET(states_url, header = headers)$content

  [1] 3c 21 44 4f 43 54 59 50 45 20 48 54 4d 4c 20 50 55 42 4c
 [20] 49 43 20 22 2d 2f 2f 57 33 43 2f 2f 44 54 44 20 48 54 4d
 [39] 4c 20 34 2e 30 31 20 54 72 61 6e 73 69 74 69 6f 6e 61 6c
 [58] 2f 2f 45 4e 22 20 22 68 74 74 70 3a 2f 2f 77 77 77 2e 77
 [77] 33 2e 6f 72 67 2f 54 52 2f 68 74 6d 6c 34 2f 6c 6f 6f 73
 [96] 65 2e 64 74 64 22 3e 0a 3c 48 54 4d 4c 3e 3c 48 45 41 44
[115] 3e 3c 4d 45 54 41 20 48 54 54 50 2d 45 51 55 49 56 3d 22
[134] 43 6f 6e 74 65 6e 74 2d 54 79 70 65 22 20 43 4f 4e 54 45
[153] 4e 54 3d 22 74 65 78 74 2f 68 74 6d 6c 3b 20 63 68 61 7

str(GET(states_url))
List of 10
 $ url        : chr "http://cdn-api.co-vin.in/api/v2/admin/location/states"
 $ status_code: int 403
 $ headers    :List of 9
  ..$ server        : chr "CloudFront"
  ..$ date          : chr "Tue, 25 May 2021 12:39:02 GMT"
  ..$ content-type  : chr "text/html"
  ..$ content-length: chr "919"
  ..$ connection    : chr "keep-alive"
  ..$ x-cache       : chr "Error from cloudfront"
  ..$ via           : chr "1.1 85ad220378d99bdabeb6c46016f1cf16.cloudfront.net (CloudFront)"
  ..$ x-amz-cf-pop  : chr "BOM51-C1"
  ..$ x-amz-cf-id   : chr "eeJq5ZtSJHZkLGoJZBTUL2xL5PcU2gjesnY7Qmg_kMnxZxZ1JUHPWA=="
  ..- attr(*, "class")= chr [1:2] "insensitive" "list"
 $ all_headers:List of 1
  ..$ :List of 3
  .. ..$ status : int 403
  .. ..$ version: chr "HTTP/1.1"
  .. ..$ headers:List of 9
  .. .. ..$ server        : chr "CloudFront"
  .. .. ..$ date          : chr "Tue, 25 May 2021 12:39:02 GMT"
  .. .. ..$ content-type  : chr "text/html"
  .. .. ..$ content-length: chr "919"
  .. .. ..$ connection    : chr "keep-alive"
  .. .. ..$ x-cache       : chr "Error from cloudfront"
  .. .. ..$ via           : chr "1.1 85ad220378d99bdabeb6c46016f1cf16.cloudfront.net (CloudFront)"
  .. .. ..$ x-amz-cf-pop  : chr "BOM51-C1"
  .. .. ..$ x-amz-cf-id   : chr "eeJq5ZtSJHZkLGoJZBTUL2xL5PcU2gjesnY7Qmg_kMnxZxZ1JUHPWA=="
  .. .. ..- attr(*, "class")= chr [1:2] "insensitive" "list"
 $ cookies    :'data.frame':    0 obs. of  7 variables:
  ..$ domain    : logi(0) 
  ..$ flag      : logi(0) 
  ..$ path      : logi(0) 
  ..$ secure    : logi(0) 
  ..$ expiration: 'POSIXct' num(0) 
  ..$ name      : logi(0) 
  ..$ value     : logi(0) 
 $ content    : raw [1:919] 3c 21 44 4f ...
 $ date       : POSIXct[1:1], format: "2021-05-25 12:39:02"
 $ times      : Named num [1:6] 0 0.242 0.283 0.284 0.321 ...
  ..- attr(*, "names")= chr [1:6] "redirect" "namelookup" "connect" "pretransfer" ...
 $ request    :List of 7
  ..$ method    : chr "GET"
  ..$ url       : chr "http://cdn-api.co-vin.in/api/v2/admin/location/states"
  ..$ headers   : Named chr "application/json, text/xml, application/xml, */*"
  .. ..- attr(*, "names")= chr "Accept"
  ..$ fields    : NULL
  ..$ options   :List of 2
  .. ..$ useragent: chr "libcurl/7.64.1 r-curl/4.3.1 httr/1.4.2"
  .. ..$ httpget  : logi TRUE
  ..$ auth_token: NULL
  ..$ output    : list()
  .. ..- attr(*, "class")= chr [1:2] "write_memory" "write_function"
  ..- attr(*, "class")= chr "request"
 $ handle     :Class 'curl_handle' <externalptr> 
 - attr(*, "class")= chr "response"
http_status(GET(states_url))
$category
[1] "Client error"

$reason
[1] "Forbidden"

$message
[1] "Client error: (403) Forbidden"
stringi::stri_enc_detect(GET(states_url, header = headers)$content)

[[1]]
     Encoding Language Confidence
1  ISO-8859-1       en       0.54
2  ISO-8859-2       ro       0.26
3       UTF-8                0.15
4    UTF-16BE                0.10
5    UTF-16LE                0.10
6   Shift_JIS       ja       0.10
7     GB18030       zh       0.10
8      EUC-JP       ja       0.10
9      EUC-KR       ko       0.10
10       Big5       zh       0.10
11 ISO-8859-9       tr       0.06
12 IBM424_rtl       he       0.02
13 IBM424_ltr       he       0.01

content(GET(states_url, header = headers), encoding = "UTF-8")

{html_document}
<html>
[1] <head>\n<meta http-equiv="Content-Type" content="text/htm ...
[2] <body>\n<h1>403 ERROR</h1>\n<h2>The request could not be  ...
content(GET(states_url, header = headers), encoding = "ISO-8859-1")

{html_document}
<html>
[1] <head>\n<meta http-equiv="Content-Type" content="text/htm ...
[2] <body>\n<h1>403 ERROR</h1>\n<h2>The request could not be  ...



Solution

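  • You create headers but never actually include them in your call to GET. In httr, request headers are supplied through add_headers(), not through a header = argument, so wrap the vector in add_headers() and pass that to GET: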

    GET(states_url, add_headers(headers))