I am using Scrapy to crawl a website.
with open('test.html', 'wb') as f:
    f.write(response.body)
This block writes the response body to a file. When I open the file, I can see many "a" tags.
When I print the same body with print, however, it shows only two "a" tags:
print response.body
Do you have any idea what is happening here?
I have solved the problem. The crawled page contains a second <html> tag inside a combobox.
I was using PyQuery, and PyQuery does not work properly when the tags in the HTML structure are malformed.
I have now switched my selector to XPath, and it finds all the "a" tags in the HTML.
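
In case it helps anyone who hits the same symptom, here is a rough way to check whether the raw body really contains more tags than the selector library reports. It is only a sketch: it assumes Python 3, it uses the standard library's `html.parser` instead of PyQuery or Scrapy, and `TagCounter`, `count_tags` and the sample markup are names and data made up for the illustration. The parser reports every start tag it meets without building a tree, so a stray second <html> tag does not hide anything from it.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Hypothetical helper: counts every start tag, without building a tree."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lowercase.
        self.counts[tag] = self.counts.get(tag, 0) + 1


def count_tags(body):
    """Return a dict mapping tag name to the number of start tags found."""
    # response.body is bytes in Scrapy; decoding as UTF-8 is an assumption here.
    if isinstance(body, bytes):
        body = body.decode("utf-8", errors="replace")
    counter = TagCounter()
    counter.feed(body)
    counter.close()
    return counter.counts


# Made-up sample that mimics the problem: a second <html> tag inside a combobox.
sample = b"""<html><body>
<a href="/one">one</a>
<select><option><html><a href="/two">two</a></option></select>
<a href="/three">three</a>
</body></html>"""

counts = count_tags(sample)
print(counts.get("a", 0), counts.get("html", 0))
```

If the number of "a" tags counted this way is higher than what the selector returns, and the "html" count is greater than one, the markup is probably the cause and not the download. Inside a spider callback you would pass `response.body` to `count_tags`, and the XPath equivalent of the PyQuery lookup is `response.xpath('//a')`.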