Tags: python, web-scraping, curl, python-requests, http-headers

curl works fine on a site, but I get blocked when using requests


I am trying to make an API call to get reviews from the Ryviu website.

If I execute the following curl command in the terminal, I get a response from the server.

curl 'https://app.ryviu.io/frontend/client/get-more-reviews?domain=tornado-fans.myshopify.com' \
  -H 'authority: app.ryviu.io' \
  -H 'accept: application/json, text/plain, */*' \
  -H 'accept-language: en-US,en;q=0.9,hi;q=0.8,de;q=0.7,ur;q=0.6,pa;q=0.5,es;q=0.4' \
  -H 'content-type: application/x-www-form-urlencoded' \
  -H 'dnt: 1' \
  -H 'origin: https://www.tornadofans.com' \
  -H 'referer: https://www.tornadofans.com/' \
  -H 'sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "macOS"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: cross-site' \
  -H 'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36' \
  --data-raw '{"handle":"tornado-24-high-velocity-metal-portable-tilt-blower-drum-fan-yellow-8540-cfm-ul","product_id":7245574144198,"page":2,"type":"load-more","order":"late","filter":"all","feature":false,"domain":"tornado-fans.myshopify.com","platform":"shopify"}' \
  --compressed

But if I use Python requests to make the same API call, the request never succeeds. This is the code I am using:

import requests

url = "https://app.ryviu.io/frontend/client/get-more-reviews?domain=tornado-fans.myshopify.com"

payload='%7B%22handle%22%3A%22tornado-24-high-velocity-metal-portable-tilt-blower-drum-fan-yellow-8540-cfm-ul%22%2C%22product_id%22%3A7245574144198%2C%22page%22%3A2%2C%22type%22%3A%22load-more%22%2C%22order%22%3A%22late%22%2C%22filter%22%3A%22all%22%2C%22feature%22%3Afalse%2C%22domain%22%3A%22tornado-fans.myshopify.com%22%2C%22platform%22%3A%22shopify%22%7D='
headers = {
  'authority': 'app.ryviu.io',
  'accept': 'application/json, text/plain, */*',
  'accept-language': 'en-US,en;q=0.9,hi;q=0.8,de;q=0.7,ur;q=0.6,pa;q=0.5,es;q=0.4',
  'content-type': 'application/x-www-form-urlencoded',
  'dnt': '1',
  'origin': 'https://www.tornadofans.com',
  'referer': 'https://www.tornadofans.com/',
  'sec-ch-ua': '"Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"',
  'sec-ch-ua-mobile': '?0',
  'sec-ch-ua-platform': '"macOS"',
  'sec-fetch-dest': 'empty',
  'sec-fetch-mode': 'cors',
  'sec-fetch-site': 'cross-site',
  'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

With the Python request, I get this response:

{"msg":"not found"}

HELP!

I want to know why curl works but no Python library does.

I have tried everything I can.


Solution

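  • The problem is most likely that your payload is not encoded the way curl sends it. curl's --data-raw sends the JSON string as is, while the payload in your Python code is that string percent-encoded with a trailing =, so the server receives a different body. Taking the dictionary from your curl example and passing it with json= instead of data=, so that requests serializes it to JSON, I got a successful response: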

    import requests
    
    url = "https://app.ryviu.io/frontend/client/get-more-reviews?domain=tornado-fans.myshopify.com"
    
    payload = {
        "handle": "tornado-24-high-velocity-metal-portable-tilt-blower-drum-fan-yellow-8540-cfm-ul",
        "product_id": 7245574144198,
        "page": 2,
        "type": "load-more",
        "order": "late",
        "filter": "all",
        "feature": False,
        "domain": "tornado-fans.myshopify.com",
        "platform": "shopify"
    }
    headers = {
      'authority': 'app.ryviu.io',
      'accept': 'application/json, text/plain, */*',
      'accept-language': 'en-US,en;q=0.9,hi;q=0.8,de;q=0.7,ur;q=0.6,pa;q=0.5,es;q=0.4',
      'content-type': 'application/x-www-form-urlencoded',
      'dnt': '1',
      'origin': 'https://www.tornadofans.com',
      'referer': 'https://www.tornadofans.com/',
      'sec-ch-ua': '"Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"',
      'sec-ch-ua-mobile': '?0',
      'sec-ch-ua-platform': '"macOS"',
      'sec-fetch-dest': 'empty',
      'sec-fetch-mode': 'cors',
      'sec-fetch-site': 'cross-site',
      'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36'
    }
    
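    # json= makes requests serialize the dict to a JSON body, like the one curl sends with --data-raw.
    response = requests.post(url, headers=headers, json=payload)
    
    print(response.status_code)  # returned 200 when this answer was tested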
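
    To see why the two bodies differ, here is a minimal offline sketch using only the standard library. It assumes the server expects the body to be JSON, which the source does not confirm, and it uses a shortened, hypothetical payload in place of the real one. The helper name parses_as_json is made up for illustration.

```python
import json
from urllib.parse import quote, unquote

# Shortened stand-in for the payload in the question (hypothetical values).
payload = {"handle": "example-product", "page": 2, "feature": False}

# What curl --data-raw sends: the JSON text itself, unchanged.
curl_body = json.dumps(payload, separators=(",", ":"))

# What the question's Python code sends: the same text percent-encoded,
# followed by "=", as if the JSON were a form field name with an empty value.
question_body = quote(curl_body, safe="") + "="


def parses_as_json(body: str) -> bool:
    """Return True if the body is valid JSON (illustrative helper)."""
    try:
        json.loads(body)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_json(curl_body))                  # True
print(parses_as_json(question_body))              # False
print(unquote(question_body) == curl_body + "=")  # True
```

    With requests, data=curl_body would send that JSON string unchanged, and json=payload serializes the dict to equivalent JSON, which is why both differ from the percent-encoded string in the question.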