Description:
I am trying to build a job ad parser for the indeed.com site (using Python + Selenium + ChromeDriver).
I am able to log in with my Facebook credentials, and I am then redirected to the default site, hu.indeed.com (as I live in Hungary).
I would like to search for jobs available in London, so I have the Selenium driver switch to the uk.indeed.com site.
Then I have Selenium locate the position input field and the location field and enter my search criteria in each. Up until this point everything works smoothly.
The problem:
After pressing the search button I can see the results window, but after a very short time I am automatically redirected to the hu.indeed.com site. As you can see from my code below, nothing in it navigates back to that site, and I have no idea why or how this is happening. My print statements show that driver.current_url changes at some point between the first and second print, and I don't understand why.
Could you please let me know why the URL changes and how I could prevent that?
Code:
driver.get("https://uk.indeed.com/")
time.sleep(1)
job_type_input=driver.find_element_by_xpath('//*[@id="text-input-what"]')
search_text=f"{jobs[0]} {extra_info}"
job_type_input.send_keys(search_text)
time.sleep(1)
print(f"1 print:{driver.current_url}") #<--- 1. print
job_location_input=driver.find_element_by_xpath('//*[@id="text-input-where"]')
job_location_input.send_keys(cities[0])
search_button=driver.find_element_by_xpath('//*[@id="jobsearch"]/button')
search_button.click()
time.sleep(5)
print(f"2 print:{driver.current_url}") #<--- 2. print
print(f"3 print:{driver.current_url}") #<--- 3. print
try:
    mosaic_element=driver.find_element_by_id("mosaic-provider-jobcards")
    html=mosaic_element.get_attribute('innerHTML')
    print("success")
except Exception:
    print("error in try")
print(f"4 print:{driver.current_url}") #<--- 4. print
Output:
1 print:https://uk.indeed.com/
2 print:https://hu.indeed.com/
3 print:https://hu.indeed.com/
error in try
4 print:https://hu.indeed.com/
I am the author of the original post, and I found the solution to this problem. As Max Daroshchanka mentioned in his answer, the problem was caused by indeed.com itself: the page reloaded shortly after it first loaded, due to some plugin (or something similar). My solution was therefore to wait for some time before using the input fields (using time.sleep(2)).
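A fixed sleep can turn out too short on a slow connection and wastes time on a fast one. As a rough sketch of an alternative (not something from the answer above), one could poll the URL until it has stopped changing for a while. The helper name wait_for_stable_url and its parameters settle, timeout, poll, clock and sleep are all hypothetical; it only needs a callable that returns the current URL, so it makes no assumptions about the Selenium API beyond driver.current_url, which the code in the question already uses.

```python
import time


def wait_for_stable_url(get_url, settle=2.0, timeout=15.0, poll=0.25,
                        clock=time.monotonic, sleep=time.sleep):
    """Block until get_url() has returned the same value for `settle` seconds.

    Hypothetical helper: returns the stable URL, or raises TimeoutError
    if the URL is still changing when `timeout` seconds have passed.
    `clock` and `sleep` are injectable only to make the helper testable.
    """
    deadline = clock() + timeout
    last_url = get_url()
    stable_since = clock()
    while True:
        now = clock()
        if now - stable_since >= settle:
            return last_url  # unchanged for long enough
        if now >= deadline:
            raise TimeoutError(
                f"URL still changing after {timeout}s: {last_url}")
        sleep(poll)
        current = get_url()
        if current != last_url:
            # A reload or redirect happened; restart the settle window.
            last_url = current
            stable_since = clock()
```

In the script from the question this would be called right after driver.get("https://uk.indeed.com/"), for example as wait_for_stable_url(lambda: driver.current_url), and only then would the input fields be located. If the returned URL is not the uk.indeed.com one, the redirect has already happened and the script can react before typing anything. Note that this only detects URL changes; a reload that keeps the same URL would still need a short sleep or an element-based wait.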