python selenium selenium-webdriver web-scraping scraper

Selenium webdriver with python to scrape dynamic page cannot find element

Many questions about scraping dynamic content have already been asked on Stack Overflow. I went through all of them, but none of the suggested solutions worked for the following problem:

Context:

Using Selenium webdriver with python
I mostly used this resource: http://selenium-python.readthedocs.org/page-objects.html regarding the Python.org example.
Page to scrape: http://propertymap.sfplanning.org/

Issue:

I have not been able to access any of the DOM elements on this page. Hints on how to access the search bar and the search button would be a great start. What I want in the end is to go through a list of addresses, launch the search for each one, and copy the information displayed on the right-hand side of the screen.

I have tried the following:

Changed the browser for webdriver (from Chrome to Firefox)

Added waiting time for the page to load

# TimeoutException comes from selenium.common.exceptions
try:
    WebDriverWait(self.driver, 10).until(EC.presence_of_element_located((By.ID, "addressInput")))
except TimeoutException:
    print("address input not found")

Tried to access the element by ID, XPath, name, tag name, etc. Nothing worked.

Questions

What else could I try that I have not so far (using Selenium webdriver)?
Are some websites really impossible to scrape? (I don't think the city uses an algorithm to generate a random DOM every time I reload the page.)

Solution

You can load http://50.17.237.182/PIM/ directly to reach the elements:

In [73]: from selenium import webdriver


In [74]: dr = webdriver.PhantomJS()

In [75]: dr.get("http://50.17.237.182/PIM/")

In [76]: print(dr.find_element_by_id("addressInput"))
<selenium.webdriver.remote.webelement.WebElement object at 0x7f4d21c80950>

If you look at the source returned for http://propertymap.sfplanning.org/, the page is a frameset, and the src attribute of its first frame points to that URL:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<html>

<head>
  <title>San Francisco Property Information Map </title>
  <META name="description" content="Public access to useful property information and resources at the click of a mouse"><META name="keywords" content="san francisco, property, information, map, public, zoning, preservation, projects, permits, complaints, appeals">
</head>
<frameset rows="100%,*" border="0">
  <frame src="http://50.17.237.182/PIM" frameborder="0" />
  <frame frameborder="0" noresize />
</frameset>

<!-- pageok -->
<!-- 02 -->
<!-- -->
</html>
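
To check any page for this kind of wrapper before reaching for Selenium locators, you can list the frame sources yourself. The following is a minimal sketch using only the standard library; the names FrameSrcParser and find_frame_sources are made up for this illustration, and OUTER_PAGE is just a trimmed copy of the markup above (in practice you would pass in dr.page_source or the downloaded HTML).

```python
from html.parser import HTMLParser


class FrameSrcParser(HTMLParser):
    """Collect the src attribute of every <frame> and <iframe> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:  # skip frames that have no src attribute
                self.sources.append(src)


def find_frame_sources(html):
    """Return the src URLs of all frames found in the given HTML string."""
    parser = FrameSrcParser()
    parser.feed(html)
    return parser.sources


# Trimmed copy of the outer page shown above.
OUTER_PAGE = """
<frameset rows="100%,*" border="0">
  <frame src="http://50.17.237.182/PIM" frameborder="0" />
  <frame frameborder="0" noresize />
</frameset>
"""

print(find_frame_sources(OUTER_PAGE))
```

If this returns a non-empty list, the elements you are looking for probably live inside one of those frames rather than in the top-level document.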

Thanks to @Alecxe, the simplest method is to use dr.switch_to.frame(0), which switches into the first frame before looking up the element:

In [77]: dr = webdriver.PhantomJS()

In [78]: dr.get("http://propertymap.sfplanning.org/")

In [79]: dr.switch_to.frame(0)

In [80]: print(dr.find_element_by_id("addressInput"))
<selenium.webdriver.remote.webelement.WebElement object at 0x7f4d21c80190>

If you visit http://50.17.237.182/PIM/ in your browser, you will see exactly the same page as http://propertymap.sfplanning.org/. The only difference is that the former gives you direct access to the elements, without switching frames.

If you want to enter a value and click the search button, it is something like:

from selenium import webdriver


dr = webdriver.PhantomJS()
dr.get("http://propertymap.sfplanning.org/")

dr.switch_to.frame(0)

dr.find_element_by_id("addressInput").send_keys("whatever")
dr.find_element_by_xpath("//input[@title='Search button']").click()
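
Since the goal is to run this for a whole list of addresses, the same steps can be wrapped in a loop. This is only a sketch under a few assumptions: search_addresses and the extract callback are hypothetical names, and the callback is left to the caller because the selector for the right-hand results panel is not given here. It uses driver.find_element(by, value) with the plain strings "id" and "xpath" (the values of By.ID and By.XPATH), because newer Selenium releases dropped the find_element_by_* helpers.

```python
def search_addresses(driver, addresses, extract,
                     url="http://propertymap.sfplanning.org/"):
    """Search each address and return {address: whatever extract(driver) returns}.

    `extract` is a caller-supplied function (hypothetical here) that reads
    the result panel once the search has been submitted.
    """
    results = {}
    for address in addresses:
        driver.get(url)            # reload so every search starts from a clean page
        driver.switch_to.frame(0)  # the search form lives inside the first frame
        driver.find_element("id", "addressInput").send_keys(address)
        driver.find_element("xpath", "//input[@title='Search button']").click()
        results[address] = extract(driver)
    return results
```

You would call it with a real driver, for example search_addresses(dr, ["whatever"], extract=my_reader), where my_reader waits for the results to appear (for instance with WebDriverWait) before reading them, since the panel is filled in after the click.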

But if you want to pull data, you may find it easier to query by URL instead, since the query returns some JSON.