I'm trying to programmatically (in Python) retrieve account information from this website for a list of properties I have (identified by BRT number).
This should be very simple, and I've read a few things I found via Google, but it's all over my head: I have no web development experience, so the vernacular goes in one ear and out the other.
The procedure should be very simple, as the web page seems very no-frills:
1. Set brt, e.g. 883309000.
2. Open the url: http://www.phila.gov/revenue/RealEstateTax/default.aspx
3. Select the "by BRT Number" field and enter brt.
4. Click the >> button to retrieve property info.
5. Scrape the bottom line (TOTALS) and the accurate-to date, in this case:
TOTALS $13,359.83 $2,539.14 $1,417.73 $1,645.59 $18,962.29
and
06/30/2015
I'm principally stuck on steps 3 and 4. I've gotten as far as:
import mechanize
from bs4 import BeautifulSoup
br = mechanize.Browser()
br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36')]
br.open('http://www.phila.gov/revenue/RealEstateTax/default.aspx')
# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(br.response().read(), "html.parser")
# Here's the BRT Number field
soup.find("input", {"id": "ctl00_BodyContentPlaceHolder_SearchByBRTControl_txtTaxInfo"})
# Here's the "Lookup by BRT" button
soup.find("input", {"id": "ctl00_BodyContentPlaceHolder_SearchByBRTControl_btnTaxByBRT"})
But I am really lost on what to do from there. Any help would be appreciated.
Have you considered using the selenium package for Python? The documentation is here; I strongly suggest you read it through, run a few basic tests to check your understanding, and skim it again before starting.
The point of Selenium is to load the page as you would in your browser and perform commands on it (which you can automate using Python code).
First import selenium:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
Then start the webdriver and load the page. The assert checks that the page title contains "Revenue Department" before proceeding.
driver = webdriver.Firefox()
driver.get("http://www.phila.gov/revenue/RealEstateTax/default.aspx")
assert "Revenue Department" in driver.title
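Following this, we need to select the BRT input box and send it the keys for brt (the generic find_element call with the "id" locator is used here because the older find_element_by_id helper was removed in recent Selenium releases):
driver.find_element("id", "ctl00_BodyContentPlaceHolder_SearchByBRTControl_txtTaxInfo").send_keys(brt)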
Finally, we need to push the >> button:
driver.find_element("id", "ctl00_BodyContentPlaceHolder_SearchByBRTControl_btnTaxByBRT").click()
Now you should be taken to the page of results.
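For the last step in your list (scraping the TOTALS line and the date), here is a rough sketch of one way it might be done. It is not tested against the live site. The names parse_totals and lookup are made up for illustration, and the code assumes that the visible text of the results page contains a line starting with TOTALS followed by dollar amounts, and that the first MM/DD/YYYY date on the page is the accurate-to date. Check both assumptions against the real page before relying on it.

```python
import re
from datetime import datetime
from decimal import Decimal

URL = "http://www.phila.gov/revenue/RealEstateTax/default.aspx"
BRT_INPUT_ID = "ctl00_BodyContentPlaceHolder_SearchByBRTControl_txtTaxInfo"
BRT_BUTTON_ID = "ctl00_BodyContentPlaceHolder_SearchByBRTControl_btnTaxByBRT"

# Assumed layout: "TOTALS $1,234.56 $7.89 ..." on a single line of page text.
TOTALS_RE = re.compile(r"^\s*TOTALS\b(.*)$", re.MULTILINE)
AMOUNT_RE = re.compile(r"\$([\d,]+\.\d{2})")
# Assumption: the first MM/DD/YYYY date in the text is the accurate-to date.
DATE_RE = re.compile(r"\b(\d{2}/\d{2}/\d{4})\b")


def parse_totals(page_text):
    """Return (list of Decimal amounts, date or None) from the page text."""
    totals = TOTALS_RE.search(page_text)
    if totals is None:
        raise ValueError("no TOTALS line found in page text")
    amounts = [
        Decimal(raw.replace(",", ""))
        for raw in AMOUNT_RE.findall(totals.group(1))
    ]
    found = DATE_RE.search(page_text)
    as_of = datetime.strptime(found.group(1), "%m/%d/%Y").date() if found else None
    return amounts, as_of


def lookup(driver, brt):
    """Run the search for one BRT number and parse the results page."""
    driver.get(URL)
    driver.find_element("id", BRT_INPUT_ID).send_keys(str(brt))
    driver.find_element("id", BRT_BUTTON_ID).click()
    # Depending on how the page loads, an explicit wait may be needed here.
    body_text = driver.find_element("tag name", "body").text
    return parse_totals(body_text)


# Possible usage with a real browser (requires selenium and Firefox):
#     from selenium import webdriver
#     driver = webdriver.Firefox()
#     for brt in ["883309000"]:
#         print(brt, lookup(driver, brt))
#     driver.quit()
```

Passing the driver in as an argument lets you reuse one browser session for your whole list of BRT numbers, and parse_totals can be tried on saved page text without opening a browser at all.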