Different Output From Same PyQuery Object


I am using Scrapy to crawl a website.

with open('test.html', 'wb') as f:
    f.write(response.body)

This block writes the response body to a file. When I open the file, I can see many "a" tags.

When I print the same body with print, it shows only two "a" tags:

print(response.body)

Do you have any idea what is happening here?


Solution

  • I have solved the problem. The crawled website has a second <html> tag inside a combobox.

    I was using PyQuery, and in my experience PyQuery does not work correctly when the HTML structure has problems with its tags, such as this nested <html>.

    I have changed my selector to XPath, and it now finds all the "a" tags in the HTML.
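
To check that the links really are present in the raw body, regardless of how a tree-building parser repairs the broken markup, you can count the start tags with a plain tokenizer. The following is a minimal sketch, assuming Python 3 and only the standard library; `AnchorCounter`, `count_anchors` and the `SAMPLE` markup are hypothetical names and data made up for illustration, not part of the original spider.

```python
from html.parser import HTMLParser


class AnchorCounter(HTMLParser):
    """Count <a> start tags without building a document tree."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag == "a":
            self.count += 1


def count_anchors(markup):
    """Return the number of <a> start tags found in an HTML string."""
    counter = AnchorCounter()
    counter.feed(markup)
    counter.close()
    return counter.count


# Hypothetical sample: a second <html> tag nested inside a <select>,
# similar to the structure described in the solution.
SAMPLE = """<html><body>
<a href="/one">one</a>
<select><option><html></option></select>
<a href="/two">two</a>
<a href="/three">three</a>
</body></html>"""

print(count_anchors(SAMPLE))
```

In a spider you could pass `response.text` (available in recent Scrapy versions) to `count_anchors` and compare the result with what your selector returns. If the numbers differ, the selector library is probably tripping over the markup, and switching to Scrapy's built-in `response.xpath('//a')`, as in the solution, is one way around it.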