Tags: python, web-scraping, scrapy, form-data, scrapy-shell

Scrapy FormRequest can't handle complex dicts as formdata


I am trying to provide formdata to a scrapy.FormRequest object. The formdata is a dict of the following structure:

{
  "param1": [
    {
      "paramA": "valueA",
      "paramB": "valueB"
    }
  ]
}

I pass it using code equivalent to the following, run in scrapy shell:

from scrapy import FormRequest

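url = 'https://www.example.com'  # Scrapy requires a scheme in the URL
method_post = 'POST'
formdata = {"param1": [{"paramA": "valueA", "paramB": "valueB"}]}  # the dict shown above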

fr = FormRequest(url=url, method=method_post, formdata=formdata)

fetch(fr)

and in response I get the following error:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Users/chhk/.local/share/virtualenvs/project/lib/python3.6/site-packages/scrapy/http/request/form.py", line 31, in __init__
    querystr = _urlencode(items, self.encoding)
  File "/Users/chhk/.local/share/virtualenvs/project/lib/python3.6/site-packages/scrapy/http/request/form.py", line 66, in _urlencode
    for k, vs in seq
  File "/Users/chhk/.local/share/virtualenvs/project/lib/python3.6/site-packages/scrapy/http/request/form.py", line 67, in <listcomp>
    for v in (vs if is_listlike(vs) else [vs])]
  File "/Users/chhk/.local/share/virtualenvs/project/lib/python3.6/site-packages/scrapy/utils/python.py", line 119, in to_bytes
    'object, got %s' % type(text).__name__)
TypeError: to_bytes must receive a unicode, str or bytes object, got dict

I have tried a variety of workarounds, including passing the whole thing as a string with various escape characters and reshaping the dict to make it more agreeable, but every variant that removes this error produces a request the server rejects (I get a 400 response).

I know that the formdata and everything else I am doing are correct, because I have replicated the request successfully in curl (formdata was provided via -d formdata.txt).

Is there a way around FormRequest's inability to deal with complex dict structures? Or am I missing something?


Solution

  • Instead of formdata, you can try the body parameter and pass the dict serialized as a JSON string. Example:

    import json

    FormRequest(url=url, method=method_post, body=json.dumps(formdata))
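The body approach can be sketched a little more fully. The question does not say what formdata.txt contained, so both variants below are guesses about what the server expects: a raw JSON body, or an ordinary form whose nested value is sent as a JSON string. The helper names json_post_kwargs and flatten_for_form and the Content-Type header are hypothetical additions for illustration, not part of Scrapy or of the original answer.

```python
import json
from urllib.parse import urlencode


def json_post_kwargs(url, payload):
    """Build keyword arguments for a POST request with a JSON body.

    Hypothetical helper: the name and the Content-Type header are
    assumptions, not something the original answer specifies.
    """
    return {
        "url": url,
        "method": "POST",
        # json.dumps can serialize nested dicts and lists.
        "body": json.dumps(payload),
        "headers": {"Content-Type": "application/json"},
    }


def flatten_for_form(payload):
    """Turn every non-string value into a JSON string.

    Hypothetical helper: the result is a flat dict of strings, which is
    the shape the traceback in the question shows formdata must have.
    """
    return {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in payload.items()
    }


formdata = {"param1": [{"paramA": "valueA", "paramB": "valueB"}]}

# Variant 1: send the whole dict as a JSON body.
kwargs = json_post_kwargs("https://www.example.com", formdata)

# Variant 2: keep form encoding, with the nested value as one JSON string.
flat = flatten_for_form(formdata)
encoded = urlencode(flat)  # the form-encoded body this would produce

# In scrapy shell (assuming Scrapy is installed):
#     from scrapy import Request, FormRequest
#     fetch(Request(**kwargs))                                  # variant 1
#     fetch(FormRequest(kwargs["url"], method="POST", formdata=flat))  # variant 2
```

Which variant works depends on the server. Comparing the request body that curl actually sends (for example with curl -v) against kwargs["body"] and encoded should show which shape matches.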