I have to scrape the data from this web page: http://www.mlsa.am/?page_id=368. The page is in Armenian. It has a series of dropdown menus: Regions, Areas, Communities, Type of Subsidy, Month and Year. Once these options are selected, a table appears with information on the citizens of those places who receive the different kinds of subsidies. The difficulty I am facing is that the second dropdown (Areas) depends on the option selected in the first dropdown, and the third (Communities) depends on what was selected in the previous dropdowns. How should I write my code for this type of web page?
This is what the page looks like when you inspect it:
<!--Մարզեր-->
<div class="td-pb-row">
<div class="td-pb-span2"></div>
<div class="td-pb-span5">
Մարզեր <span class="ben-required">*</span>
<select id="ref_regions_id" name="ref_regions" style="border:1px solid #0790A2;" >
<option value="0" > Ընտրել </option>
<option value="1"> ԱՐԱԳԱԾՈՏՆ</option>
<option value="2"> ԱՐԱՐԱՏ</option>
<option value="3"> ԱՐՄԱՎԻՐ</option>
<option value="4"> ԳԵՂԱՐՔՈՒՆԻՔ</option>
<option value="5"> ԼՈՌԻ</option>
<option value="6"> ԿՈՏԱՅՔ</option>
<option value="7"> ՇԻՐԱԿ</option>
<option value="8"> ՍՅՈՒՆԻՔ</option>
<option value="9"> ՎԱՅՈՑ ՁՈՐ</option>
<option value="10"> ՏԱՎՈՒՇ</option>
<option value="11"> ԵՐԵՎԱՆ</option>
</select>
</div>
I am using Selenium with Python, and this is my code so far:
import time
import requests
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.keys import Keys
chrome_path = r"C:\Users\ivrav\selenium-2.25.0\Driver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
print("loading url into browser...")
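def get_all_pages():
    payload = {'value': '1'}  # not used yet
    driver.get("http://www.mlsa.am/?page_id=368")
    print(driver.page_source)  # HTML of the page as loaded by the browser
    time.sleep(2)  # crude pause to let the page finish loading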
To select an option from the dropdown shown in your HTML, I would use Selenium's Select class:
from selenium.webdriver.support.ui import Select
select = Select(driver.find_element_by_id('ref_regions_id'))
Then you can select an option by its visible text:
select.select_by_visible_text("ԱՐՄԱՎԻՐ")
Or by the value attribute on the option elements (passed as a string):
select.select_by_value("3")  # the value of the ԱՐՄԱՎԻՐ option
Lastly, you can get all available options in the dropdown:
options = select.options
for option in options:
    # each option is a WebElement, so print its text and value
    print(option.text, option.get_attribute("value"))
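If you want to loop over every region later, it may help to collect the options into a dictionary first. This is only a sketch: options_by_value is a hypothetical helper name, and it assumes the placeholder entry ("Ընտրել") always has the value "0", as in the HTML above.

```python
def options_by_value(select, placeholder="0"):
    """Map each option's value attribute to its label, skipping the placeholder.

    `select` is expected to expose `.options` the way Selenium's Select does;
    each option needs `.text` and `.get_attribute("value")`.
    """
    result = {}
    for option in select.options:
        value = option.get_attribute("value")
        if value != placeholder:
            # Labels in the page have a leading space, so strip them.
            result[value] = option.text.strip()
    return result
```

Calling options_by_value(select) on the regions dropdown should give you the values "1" to "11" mapped to the region names.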
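To handle dropdowns that depend on earlier ones, select an option in each dropdown in the correct order (Regions, then Areas, then Communities), picking each value from the options that dropdown offers after the previous selection. The dependent dropdown may take a moment to repopulate, so you may need to wait before selecting from it. Each dropdown has a unique ID, which makes it easy to locate.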
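Selecting in order could be sketched roughly like this. The names choose_in_order, option_values and make_select are hypothetical, and the code assumes each dependent dropdown is filled in shortly after the previous selection and that "0" is always the placeholder value. It polls with a simple loop to stay library-agnostic; Selenium's WebDriverWait would be the more usual tool for the waiting part.

```python
import time

PLACEHOLDER_VALUE = "0"  # assumption: the "Ընտրել" entry always has value "0"


def option_values(select):
    """Return the values of the real options, skipping the placeholder."""
    values = [opt.get_attribute("value") for opt in select.options]
    return [v for v in values if v != PLACEHOLDER_VALUE]


def choose_in_order(make_select, choices, timeout=10.0, poll=0.5):
    """Select one value per dropdown, in order, waiting for each to populate.

    make_select(element_id) must return a Select-like object with `.options`
    and `.select_by_value()`; choices is a list of (element_id, value) pairs.
    """
    for element_id, value in choices:
        deadline = time.monotonic() + timeout
        # Look the dropdown up again on every poll, since the page may
        # rebuild its options after the previous selection.
        while value not in option_values(make_select(element_id)):
            if time.monotonic() > deadline:
                raise TimeoutError(f"{value!r} never appeared in #{element_id}")
            time.sleep(poll)
        make_select(element_id).select_by_value(value)
```

With Selenium you would pass something like lambda element_id: Select(driver.find_element_by_id(element_id)) as make_select, and a list starting with ("ref_regions_id", "3"). The IDs of the other dropdowns are not shown in your HTML, so you would need to look them up in the page source.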