I have the following issue.
When I run this Scrapy shell command:
scrapy shell "http://en.50partners.fr/Startups/"
I expect to retrieve the full page. Unfortunately, when I run view(response),
I get the page without the startups section itself. Do you have any idea how to fix this?
Thanks.
The part with the startups is loaded dynamically.
Try to open the initial page in a browser of your choice with JavaScript turned off and you'll get the same result.
Now inspect the HTML of this page to see this:
<div class="Folder_page_block startups"
data-children-count="46"
data-children-reload-url="http://en.50partners.fr/fiftyPartners/ajax/folder/67/children/%page%/%limit%/%view%"
data-children-view="line">
That is the URL the data is loaded from. You might want to fiddle with the URL a bit: strip everything after "children" and send another Request to the resulting URL.
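Stripping the placeholders could look roughly like this. It is only a sketch: strip_placeholders is a made-up helper name, and it assumes the server accepts the URL once everything after "children" is removed, as suggested above.

```python
# URL template copied from the data-children-reload-url attribute.
template = (
    "http://en.50partners.fr/fiftyPartners/ajax/folder/67/"
    "children/%page%/%limit%/%view%"
)


def strip_placeholders(url, marker="children"):
    """Return url cut off right after the first occurrence of marker.

    Hypothetical helper, not part of Scrapy. If marker is missing,
    the URL is returned unchanged.
    """
    head, sep, _tail = url.partition(marker)
    return head + sep if sep else url


ajax_url = strip_placeholders(template)
print(ajax_url)
```

In the Scrapy shell you could then call fetch(ajax_url) to replace response with the result of the new request.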
The resulting response isn't the HTML you might expect; it is JSON. You might want to import json, run json.loads(response.text), and inspect the resulting list.
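A minimal sketch of the decoding step follows. Note that json.loads is the function that accepts a string, whereas json.load expects a file object. The sample_body string and its "name" field are invented stand-ins for response.text, since the real payload's fields are not shown here.

```python
import json

# Hypothetical stand-in for response.text; the real field names
# returned by the site are an assumption and may differ.
sample_body = '[{"name": "Startup A"}, {"name": "Startup B"}]'


def parse_children(body):
    """Decode the JSON text of the AJAX response into Python objects."""
    return json.loads(body)


items = parse_children(sample_body)

# Look at the overall shape before relying on any particular key.
print(type(items).__name__, len(items))
for item in items:
    print(sorted(item.keys()))
```

In the Scrapy shell, parse_children(response.text) would take the place of the sample string.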
Have fun :)