Headless Chrome with Selenium not loading the web-page elements correctly


I am loading a website for web scraping using Selenium WebDriver with Python.
I need to read a table from it, which works perfectly when headless is not set to True.
In headless mode the table itself still loads, but it shows 'No results found' instead of the data rows it returns otherwise.

I have tested the code with headless mode disabled. It works like a charm and loads the table with all of its rows every time. As soon as I enable headless mode, the table data is missing.
(Note that the table and its headers still load; it shows 'No results found' in place of the data rows.)
I also tried faking a headed user with the argument 'user=some headed user'.
I have also tried enabling and disabling a number of Chrome options, such as
disable GPU; start with a maximized window; change the screen size; bypass the proxy
and everything else that is generally used to debug headless Chrome.

Following is the code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys

service = webdriver.chrome.service.Service(r'C:/Program Files (x86)/SeleniumWrapper/chromedriver.exe')
service.start()
chrome_options = Options()

chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36")
chrome_options.add_argument("--disable-gpu")
chrome_options.headless = True

driver = webdriver.Remote(service.service_url, desired_capabilities=chrome_options.to_capabilities())

driver.get('https://cambodiantr.gov.kh/index.php?r=searchMeasures/index')

table = driver.find_element_by_xpath('//*[@id="measures-grid"]/table')
all_rows = table.find_elements_by_tag_name('tr')
print(all_rows[0].text)
print(all_rows[1].text)

Results:

-Without headless mode:
| Name - Enforced By - Type - Validity From - Validity To |
| A suspension on the clearance of imported goods may be applied in the case where there is an objection lodged against a registered owner's mark - Ministry of Agriculture, Forestry and Fisheries - Prohibition - 14-01-2012 - 31-12-9999 |

-With headless mode:
| Name - Enforced By - Type - Validity From - Validity To |
| No results found. |


Solution

  • After some more research, I found that adding the following line to my program does the trick:

    chrome_options.add_argument('--lang=en_US')

    Headless Chrome apparently does not handle language settings the same way a headed session does, and some pages do not respond well to that. Explicitly setting the language that the page is served in lets it load correctly.
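
    For context, here is a minimal sketch of how the --lang switch might be folded into the option setup from the question. The helper names headless_chrome_arguments and build_options are hypothetical and not part of Selenium. Passing "--headless" as an argument in place of chrome_options.headless = True is an assumption made so the sketch does not depend on one Selenium version.

```python
def headless_chrome_arguments(lang="en_US"):
    """Return the Chrome switches from the question plus an explicit --lang.

    Hypothetical helper: it only builds a list of strings, so it can be
    inspected without launching a browser.
    """
    return [
        "--headless",              # run Chrome without a visible window
        "--disable-gpu",           # switch already used in the question
        "--lang={}".format(lang),  # the fix described in the answer
    ]


def build_options(lang="en_US"):
    """Build a Selenium Options object carrying the switches above."""
    # Imported lazily so headless_chrome_arguments() works without Selenium.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in headless_chrome_arguments(lang):
        options.add_argument(argument)
    return options
```

    The object returned by build_options() would be used where chrome_options appears in the question's code. Whether en_US is the right value depends on the language the target page is served in.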