For a project I am trying to access data produced by an online model/algorithm. The owners provide Python code to access this data, as follows:
import requests
import random
import os
import pandas as pd
from rdkit import Chem

upload_url=r'site name'

def predict_pka(smi):
    param={"Smiles" : ("tmg", smi)}
    headers={'token':'tokenstring'}
    response=requests.post(url=upload_url, files=param, headers=headers)
    jsonbool=int(response.headers['ifjson'])
    if jsonbool==1:
        res_json=response.json()
        if res_json['status'] == 200:
            pka_datas = res_json['gen_datas']
            return pka_datas
        else:
            raise RuntimeError("Error for prediction")
    else:
        raise RuntimeError("Error for prediction")

if __name__=="__main__":
    smi = "CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl"
    data_pka = predict_pka(smi)
    print(data_pka)
I took out the actual URL and the token, since I don't know if it's responsible to share those. This code works when run with Python from RStudio, and I can get the data.
However, I want to get the data using an R script, so I tried translating the code to R:
getPKA = function(){
  upload_url="site name"
  param = rjson::toJSON(list('Smiles' = c("tmg", "CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl")))
  response = httr::POST(url = upload_url,
                        httr::add_headers(
                          'token' = 'tokenstring'
                        ),
                        encode = c("multipart", "form", "json", "raw"),
                        httr::content_type_json(),
                        body = param,
                        httr::verbose()
  )
  return(response)
}
When I run the R code, I get the following output:
-> POST /modules/upload0/ HTTP/1.1
-> Host: host
-> User-Agent: libcurl/7.84.0 r-curl/5.0.0 httr/1.4.5
-> Accept-Encoding: deflate, gzip
-> Accept: application/json, text/xml, application/xml, */*
-> token: tokenstring
-> Content-Type: application/json
-> Content-Length: 56
->
>> {"Smiles":["tmg","CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl"]}
<- HTTP/1.0 500 INTERNAL SERVER ERROR
<- Content-Type: text/html; charset=utf-8
<- X-XSS-Protection: 0
<- Connection: close
<- Server: Werkzeug/1.0.1 Python/3.6.12
<- Date: Sat, 13 May 2023 10:28:53 GMT
<-
Once again, I redacted the token and the host.
I got to this R code by reading up a bit on both the Python requests package and the httr package. However, I don't know much about API connections or web connections in general, and I only need it for this data.
I think it might have to do with the param format. When I print param in the Python code I get this:
{'Smiles': ('tmg', 'CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl')}
while if I print it in the R code I get this: {"Smiles":["tmg","CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl"]}.
Parentheses are used in the Python version and square brackets in the R version.
I don't know if this is actually the problem or how to change it. I tried using different list types (vector, list), and I tried directly using the line {'Smiles': ('tmg', 'CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl')} from the Python code as a character string in the body, but I get the same error.
If I print the response itself (or the content using content(response)), it says: AttributeError: 'NoneType' object has no attribute 'filename' among the HTML file contents.
I do see a lot of questions on Stack Overflow about similar problems, and I tried copying their code and molding it to my needs, but it does not really change anything.
Thank you for your time!
That requests call POSTs a multipart-encoded file, and the request looks something like this:
POST / HTTP/1.1
Host: localhost:1234
User-Agent: python-requests/2.28.2
Accept-Encoding: gzip, deflate, br
Accept: */*
Connection: keep-alive
token: tokenstring
Content-Length: 176
Content-Type: multipart/form-data; boundary=2202e29dea10e9ab00dcf55c67ed1817

--2202e29dea10e9ab00dcf55c67ed1817
Content-Disposition: form-data; name="Smiles"; filename="tmg"

CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl
--2202e29dea10e9ab00dcf55c67ed1817--
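To make the difference from the JSON attempt concrete, here is a minimal standard-library sketch that builds both bodies by hand. The helper names `json_body` and `multipart_body` are made up for illustration, and this is not how requests builds its body internally; it only reproduces the layout of the request shown above. The point is that the JSON body carries no file part and no `filename`, which would be consistent with the server complaining about a missing `filename` attribute.

```python
import json
import uuid

SMILES = "CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl"


def json_body(smi):
    """Body sent by the httr attempt: a plain JSON object, no file part."""
    return json.dumps({"Smiles": ["tmg", smi]}, separators=(",", ":")).encode()


def multipart_body(field, filename, content, boundary=None):
    """Hand-built multipart/form-data body containing one file part.

    Hypothetical helper for illustration; returns (body, content_type).
    """
    boundary = boundary or uuid.uuid4().hex  # any unique token works
    lines = [
        f"--{boundary}",
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"',
        "",                 # blank line ends the part headers
        content,            # the "file" content, here the SMILES string
        f"--{boundary}--",  # closing boundary
        "",                 # trailing CRLF
    ]
    body = "\r\n".join(lines).encode()
    return body, f"multipart/form-data; boundary={boundary}"


if __name__ == "__main__":
    as_json = json_body(SMILES)
    as_form, content_type = multipart_body("Smiles", "tmg", SMILES)
    print(len(as_json), as_json.decode())
    print(len(as_form), content_type)
    print(as_form.decode())
```

With a 32-character boundary the multipart body comes out at 176 bytes and the JSON body at 56 bytes, matching the Content-Length values in the two traces.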
With httr2 / curl, this should be close enough:
library(httr2)

getPKA <- function(smi){
  upload_url <- "site name"
  # Seems that we need to read actual file from disk to include filename="tmg"
  tmg_path <- file.path(tempdir(),"tmg")
  write(smi,tmg_path)
  tmg_form_data <- curl::form_file(tmg_path, type = "text/plain")

  request(upload_url) %>%
    req_headers(token = "tokenstring") %>%
    req_body_multipart(Smiles = tmg_form_data) %>%
    req_timeout(5) %>%
    req_perform(verbosity = 2) %>%
    resp_body_json()
}

getPKA("CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl")
#> -> POST / HTTP/1.1
#> -> Host: localhost:1234
#> -> User-Agent: httr2/0.2.2 r-curl/5.0.0 libcurl/7.84.0
#> -> Accept: */*
#> -> Accept-Encoding: deflate, gzip
#> -> token: tokenstring
#> -> Content-Length: 220
#> -> Content-Type: multipart/form-data; boundary=------------------------744c5a08426eb63b
#> ->
#> >> --------------------------744c5a08426eb63b
#> >> Content-Disposition: form-data; name="Smiles"; filename="tmg"
#> >> Content-Type: text/plain
#> >>
#> >> CCOP(=S)(OCC)OC1=NC(=C(C=C1Cl)Cl)Cl
#> >>
#> >> --------------------------744c5a08426eb63b--
#> Error:
#> ! Timeout was reached: [localhost:1234] Operation timed out after 5001 milliseconds with 0 bytes received
Created on 2023-05-13 with reprex v2.0.2