Tags: python, python-3.x, scrapy, scrapy-shell

Post request with scrapy not redirecting properly?


I'm trying to extract some data from http://www.bcpa.com using Scrapy. I have a list of addresses and want to extract the information the website associates with each one, so I need to "search by address" through this URL: http://www.bcpa.net/RecAddr.asp

I tried 8433 as the street number and LAKEVIEW as the street name, and the site redirected me to this URL: http://www.bcpa.net/RecInfo.asp?URL_Folio=474128020500, which is the one I want. But, as you can see, the values I used for the search do not appear in the resulting URL. I checked the page with the inspector and got something like this:

[Screenshot: browser inspector view of the search request, including its request headers]

So I made a POST request using Scrapy, passing the parameters as follows:

>>> from scrapy.http import FormRequest
>>> form_data = {"Situs_Street_Number":"8433", "Situs_Street_Name":"LAKEVIEW"}
>>> url = "http://www.bcpa.net/RecSearch.asp"
>>> r = FormRequest(url, method = "POST", formdata = form_data)
>>> fetch(r)
2017-02-16 08:22:38 [scrapy.core.engine] INFO: Spider opened
2017-02-16 08:22:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.bcpa.net/robots.txt> (referer: None)
2017-02-16 08:22:41 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://www.bcpa.net/RecMenu.asp> from <POST http://www.bcpa.net/RecSearch.asp>
2017-02-16 08:22:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://www.bcpa.net/RecMenu.asp> (referer: None)
>>> 

As you can see, it didn't work: the site redirected me to the original page instead of the results page. I don't know why. Any ideas?


Solution

  • Do you see the Request Headers section in your screenshot?

    Send those same headers along with your POST request, and it should work.
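
    A minimal sketch of that suggestion, not a verified fix. The `headers` argument of `FormRequest` is part of Scrapy, but the header names and values below are assumptions: the `Referer` is guessed from the search form URL in the question, and the `User-Agent` is a placeholder. Replace both with the exact values shown in the inspector. `BROWSER_HEADERS` and `build_search_request` are hypothetical names used only for illustration.

```python
# Headers to copy from the "Request Headers" panel of the browser inspector.
# These values are placeholders (assumptions), not the site's known requirements.
BROWSER_HEADERS = {
    "Referer": "http://www.bcpa.net/RecAddr.asp",  # the search form page
    "User-Agent": "Mozilla/5.0",  # placeholder: use your browser's full string
}


def build_search_request(street_number, street_name):
    """Return a POST FormRequest carrying the search fields and the headers."""
    # Imported here so the header dictionary can be inspected without Scrapy.
    from scrapy.http import FormRequest

    form_data = {
        "Situs_Street_Number": street_number,
        "Situs_Street_Name": street_name,
    }
    # Passing formdata makes FormRequest default to the POST method.
    return FormRequest(
        "http://www.bcpa.net/RecSearch.asp",
        formdata=form_data,
        headers=BROWSER_HEADERS,
    )
```

    In the Scrapy shell this could be used as `fetch(build_search_request("8433", "LAKEVIEW"))`. If the redirect still goes to RecMenu.asp, compare the remaining headers in the inspector (a session cookie, for example) with what the request sends.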