
Scrapy POST request not working - 400 Bad Request


I am moving from Python's requests library to Scrapy, and I am having trouble making a simple POST request. I am setting the headers and payload like this:

headers = {
    'Accept':'*/*',
    'Accept-Encoding':'gzip, deflate, br',
    'accept-language':'en_US',
    'Connection':'keep-alive',
    'Content-Length':'151',
    'content-type':'application/json',
    'Cookie':cookie,
    'Host':host,
    'Origin':origin,
    'Referer':referer,
    'User-Agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
    'x-csrf-token':token
}

payload = {"targetLocation":{"latitude":lat,"longitude":lng}}

And then making the request like this:

def start_requests(self):
    u = self.url
    yield scrapy.Request(u, method='POST',
                            callback=self.parse_httpbin,
                            errback=self.errback_httpbin,
                            body=json.dumps(self.payload),
                            headers=self.headers)

And that keeps giving me a 400 status. If I make the request with the exact same headers and payload using the requests library, it returns a 200 status and the expected JSON.

r = requests.post(url, headers=headers, data=json.dumps(payload), verify=False)

What am I doing wrong?


Solution

  • A couple of the headers in your request are not advisable to set yourself when using general-purpose HTTP libraries, because most libraries generate them on their own:

    • Host
    • Content-Length

    Specifically, the HTTP RFCs state clearly that whenever a Content-Length header is sent more than once (which Scrapy may be doing here, since it also computes its own from the request body), the server must respond with a 400. Requests likely doesn't set its own Content-Length header and instead defers to yours. See the sketch below this list.
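
For illustration, here is a minimal sketch of the spider with those two headers removed. The class name LocationSpider is hypothetical, and attributes such as self.url, self.payload, self.cookie, self.origin, self.referer, and self.token are assumed to be set as in the question:

import json
import scrapy

class LocationSpider(scrapy.Spider):
    name = 'location'  # hypothetical spider name

    def start_requests(self):
        headers = {
            'Accept': '*/*',
            'Accept-Encoding': 'gzip, deflate, br',
            'Accept-Language': 'en_US',
            'Connection': 'keep-alive',
            'Content-Type': 'application/json',
            'Cookie': self.cookie,
            'Origin': self.origin,
            'Referer': self.referer,
            'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
            'x-csrf-token': self.token,
            # 'Host' and 'Content-Length' are deliberately omitted:
            # Scrapy fills in both itself, and sending a second
            # Content-Length is what can trigger the 400.
        }
        yield scrapy.Request(self.url, method='POST',
                             callback=self.parse_httpbin,
                             errback=self.errback_httpbin,
                             body=json.dumps(self.payload),
                             headers=headers)

Letting the library compute Content-Length from the body it actually sends also means the value stays correct if the payload changes, rather than being pinned to a hard-coded '151'.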