Many questions about scraping dynamic content have been asked on Stack Overflow. I went through them, but none of the suggested solutions worked for the following problem:
I have not been able to access any of the DOM elements on this page. Hints on how to access the search bar and the search button would be a great start. In the end, I want to go through a list of addresses, launch the search for each one, and copy the information displayed on the right-hand side of the screen.
I have tried the following:
I added a wait for the page to load:
try:
    WebDriverWait(self.driver, 10).until(EC.presence_of_element_located((By.ID, "addressInput")))
except TimeoutException:  # from selenium.common.exceptions
    print("address input not found")
Questions
You can use this URL, http://50.17.237.182/PIM/, to get the source:
In [73]: from selenium import webdriver
In [74]: dr = webdriver.PhantomJS()
In [75]: dr.get("http://50.17.237.182/PIM/")
In [76]: print(dr.find_element_by_id("addressInput"))
<selenium.webdriver.remote.webelement.WebElement object at 0x7f4d21c80950>
If you look at the source returned for http://propertymap.sfplanning.org/, there is a frame element whose src attribute is that URL:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>San Francisco Property Information Map </title>
<META name="description" content="Public access to useful property information and resources at the click of a mouse"><META name="keywords" content="san francisco, property, information, map, public, zoning, preservation, projects, permits, complaints, appeals">
</head>
<frameset rows="100%,*" border="0">
<frame src="http://50.17.237.182/PIM" frameborder="0" />
<frame frameborder="0" noresize />
</frameset>
<!-- pageok -->
<!-- 02 -->
<!-- -->
</html>
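To check which URLs a frameset page actually loads, the frame sources can be pulled out of the page source programmatically. The following is a minimal sketch using only the standard library; the names FrameSrcParser and frame_sources are made up for illustration and are not part of Selenium.

```python
from html.parser import HTMLParser


class FrameSrcParser(HTMLParser):
    """Collect the src attribute of every <frame> and <iframe> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; self-closing tags also land here.
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:  # skip frames that have no src, like the second one above
                self.sources.append(src)


def frame_sources(page_source):
    """Return the list of frame URLs found in an HTML string."""
    parser = FrameSrcParser()
    parser.feed(page_source)
    return parser.sources
```

With a live driver this would be called as frame_sources(dr.page_source), and any URL it returns can then be opened directly with dr.get().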
Thanks to @Alecxe, the simplest method is to use dr.switch_to.frame(0):
In [77]: dr = webdriver.PhantomJS()
In [78]: dr.get("http://propertymap.sfplanning.org/")
In [79]: dr.switch_to.frame(0)
In [80]: print(dr.find_element_by_id("addressInput"))
<selenium.webdriver.remote.webelement.WebElement object at 0x7f4d21c80190>
If you visit http://50.17.237.182/PIM/ in your browser, you will see exactly the same page as propertymap.sfplanning.org/. The only difference is that the former gives you direct access to the elements, without switching frames.
If you want to input a value and click the search button, it is something like:
from selenium import webdriver
dr = webdriver.PhantomJS()
dr.get("http://propertymap.sfplanning.org/")
dr.switch_to.frame(0)
dr.find_element_by_id("addressInput").send_keys("whatever")
dr.find_element_by_xpath("//input[@title='Search button']").click()
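For the stated goal of running through a list of addresses, the same steps can be wrapped in a loop. This is only a sketch under a few assumptions: the driver has already loaded the page, the form stays in the first frame after each search, and the on_result hook is a hypothetical callback, since nothing here identifies the elements of the results panel. It uses find_element(by, value) because newer Selenium 4 releases dropped the find_element_by_* helpers.

```python
# Locator strategies as plain strings; these equal selenium's By.ID and By.XPATH.
ADDRESS_INPUT = ("id", "addressInput")
SEARCH_BUTTON = ("xpath", "//input[@title='Search button']")


def search_addresses(driver, addresses, on_result=None):
    """Run one search per address inside the first frame of the loaded page.

    `driver` is any Selenium WebDriver that has already loaded the page.
    `on_result` is a hypothetical hook, called as on_result(driver, address)
    after each click; reading the results panel is left to the caller.
    Returns the list of addresses that were submitted.
    """
    driver.switch_to.frame(0)  # the search form lives in the first frame
    submitted = []
    for address in addresses:
        box = driver.find_element(*ADDRESS_INPUT)
        box.clear()  # drop the previous address before typing the next one
        box.send_keys(address)
        driver.find_element(*SEARCH_BUTTON).click()
        if on_result is not None:
            on_result(driver, address)
        submitted.append(address)
    return submitted
```

A call would look like search_addresses(dr, ["first address", "second address"]) after dr.get(...). In practice a wait is probably needed between the click and reading the results, since the panel is filled in dynamically.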
But if you want to pull data, you may find querying the URL directly an easier option: the query returns JSON.
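The exact query URL is not given here, so it would have to be found first, for example by watching the requests the page makes in the browser's network tab. Assuming you have such a URL, a minimal standard-library sketch for fetching and decoding the JSON could look like this; fetch_json is a made-up helper name, and the opener parameter exists only so the function can be exercised without a network connection.

```python
import json
from urllib.request import urlopen


def fetch_json(query_url, opener=urlopen, timeout=10):
    """Fetch query_url and decode the response body as JSON.

    `query_url` must be supplied by the caller; no endpoint is assumed here.
    `opener` defaults to urllib's urlopen and can be swapped out for testing.
    """
    with opener(query_url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The result is an ordinary Python dict or list, so the fields of interest can be read without driving a browser at all.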