I am using Python 2 with urlopen from urllib2 and BeautifulSoup from bs4 to scrape the HTML of several variations of the same product listing.
Namely: https://www.amazon.com/Mouse-Pad-Star-Wars-V4/dp/B00TGGVHOW
When I scrape the different variations of this listing, I get exactly the same HTML back as for Var1.
There are 9 variations in total, and the HTML returned for each one is identical to that of the first variation.
This is strange, because if I visit the direct links and inspect the source, I get different HTML, but when the pages are scraped with Python, the HTML is identical.
Can someone please take a look at this and guide me in the right direction? Much appreciated!
Just to add some information: Mr.sytech brought up a very good point. However, this issue does not occur for every product, only for some. For example, with this product: https://www.amazon.com/VicTsing-Wireless-Portable-Receiver-Adjustable/dp/B013WC0P2A it works as intended, and every variation returns its own unique HTML.
You can't expect what urllib2 retrieves to be identical to what you see in your browser. Chances are, the display of variations is controlled via JavaScript. urllib2 simply retrieves the server's response containing the HTML; it does not execute JavaScript or do anything else that your browser would do.
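To make the point concrete, here is a minimal sketch using a made-up page (not Amazon's actual markup). It is written for Python 3 and uses the standard-library HTML parser instead of BeautifulSoup so that it runs without extra packages. The page content, the `variation` id, and the helper names are all assumptions for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page: the server always sends "Var1"; the script would swap
# in another variation, but only if something actually executes it.
STATIC_HTML = """
<html><body>
<span id="variation">Var1</span>
<script>
  document.getElementById("variation").textContent = "Var2";
</script>
</body></html>
"""


class SpanText(HTMLParser):
    """Collect the text inside <span id="variation">."""

    def __init__(self):
        super().__init__()
        self.in_span = False
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("id") == "variation":
            self.in_span = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_span = False

    def handle_data(self, data):
        # Script bodies also arrive here, but as plain text: nothing runs.
        if self.in_span:
            self.text.append(data)


def variation_text(html):
    """Return the variation label as it appears in the raw HTML."""
    parser = SpanText()
    parser.feed(html)
    return "".join(parser.text).strip()


print(variation_text(STATIC_HTML))  # prints: Var1
```

A browser loading the same markup would end up showing "Var2", because it runs the script after parsing. A parser that only reads the response sees "Var1" every time.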
There may be other ways to get the data with urllib2, but you can also use browser automation, such as Selenium, which can give you the DOM as it appears after the browser has executed the JavaScript and done everything else it normally does for you.
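A rough sketch of the Selenium route might look like the following. It assumes Python 3, the `selenium` package, and a Chrome installation with a matching driver; the function names `make_headless_chrome` and `fetch_rendered_html` are made up for this example, and whether a simple page load is enough for Amazon's pages is untested here.

```python
def make_headless_chrome():
    """Create a headless Chrome driver (assumes Selenium and Chrome exist)."""
    # Imported lazily so this module still loads where Selenium is absent.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def fetch_rendered_html(url, driver_factory=make_headless_chrome):
    """Load url in a real browser and return the HTML of the resulting DOM."""
    driver = driver_factory()
    try:
        driver.get(url)
        # page_source reflects the DOM at this moment; content that loads
        # later may need an explicit wait, which is not shown here.
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page failed.
        driver.quit()
```

Calling `fetch_rendered_html(url)` once per variation URL and passing each result to BeautifulSoup would then replace the urlopen call. The `driver_factory` parameter is only there so a different browser, or a stub in tests, can be swapped in.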