I am trying to scrape the data in a bunch of rows. I am able to expand an individual row using the following:
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//*[@id="7858101"]'))).click()
The problem is that each row has a different id. The rows share the same class names, so I have also tried:
WebDriverWait(driver, 60).until(EC.presence_of_elements_located((By.CLASS_NAME, 'course-row normal faculty-BU active'))).click()
I have attached a few rows below. Any suggestions on how I can fix this?
<tr id="7858101" class="course-row normal faculty-BU active" data-cid="7858101" data-cc="ACTG1P01" data-year="2021" data-session="FW" data-type="UG" data-subtype="UG" data-level="Year1" data-fn2_notes="BB" data-duration="2" data-class_type="ASY" data-course_section="1" data-days=" " data-class_time="" data-room1="ASYNC" data-room2="" data-location="ASYNC" data-location_desc="" data-instructor="Zhang, Xia (Celine)" data-msg="0" data-main_flag="1" data-secondary_type="E" data-startdate="1631073600" data-enddate="1638853200" data-faculty_code="BU" data-faculty_desc="Goodman School of Business">
<td class="arrow"><span class="fa fa-angle-down"></span></td>
<td class="course-code">ACTG 1P01 </td>
<td class="title"><a href="#" data-cc="ACTG1P01" data-cid="7858101">Introduction to Financial Accounting</a> <div class="details-loader" style="display: none;"><span class="fa fa-refresh fa-spin fa-fw"></span></div></td>
<td class="duration">D2</td>
<td class="days"> </td>
<td class="time"> </td>
<!-- <td class="start" data-sort-value="1631073600">Sep 08, 2021</td> -->
<!-- <td class="end" data-sort-value="1638853200">Dec 07, 2021</td> -->
<td class="type">ASY</td>
<td class="data"><div style="" class="course-details-data">
<div class="description">
<h3>Introduction to Financial Accounting</h3>
<p class="page-intro">Fundamental concepts of financial accounting as related to the balance sheet, income statement and statement of cash flows. Understanding the accounting cycle and routine transactions. Integrates both theoretical and practical application of accounting concepts.</p>
<p><strong>Format:</strong> Lectures, discussion, 3 hours per week.</p>
<p><strong>Restrictions:</strong> open to BAcc majors.</p>
<p><strong>Exclusions:</strong> Completion of this course will replace previous assigned grade and credit obtained in ACTG 1P11, 1P91 and 2P51.</p>
<p><strong>Notes:</strong> Open to Bachelor of Accounting majors. </p>
</div>
<div class="vitals">
<ul>
<li><strong>Duration:</strong> Sep 08, 2021 to Dec 07, 2021</li>
<li>
<strong>Location:</strong> ASYNC </li>
<li><strong>Instructor:</strong> Zhang, Xia (Celine)</li>
<li><strong>Section:</strong> 1</li>
</ul>
</div>
<hr>
</div>
</td>
</tr>
<tr id="3724102" class="course-row normal faculty-BU active" data-cid="3724102" data-cc="ACTG1P01" data-year="2021" data-session="FW" data-type="UG" data-subtype="UG" data-level="Year1" data-fn2_notes="BB" data-duration="2" data-class_type="LEC" data-course_section="2" data-days=" M R " data-class_time="1100-1230" data-room1="GSB306" data-room2="" data-location="GSB306" data-location_desc="" data-instructor="Zhang, Xia (Celine)" data-msg="0" data-main_flag="1" data-secondary_type="E" data-startdate="1631073600" data-enddate="1638853200" data-faculty_code="BU" data-faculty_desc="Goodman School of Business">
<td class="arrow"><span class="fa fa-angle-right"></span></td>
<td class="course-code">ACTG 1P01 </td>
<td class="title"><a href="#" data-cc="ACTG1P01" data-cid="3724102">Introduction to Financial Accounting</a> <div class="details-loader"><span class="fa fa-refresh fa-spin fa-fw"></span></div></td>
<td class="duration">D2</td>
<td class="days">
<table class="coursecal">
<thead>
<tr>
<th class="">S</th>
<th class="active">M</th>
<th class="">T</th>
<th class="">W</th>
<th class="active">T</th>
<th class="">F</th>
<th class="">S</th>
</tr>
</thead>
<tbody>
<tr>
<td class="weekend "></td>
<td class="active"></td>
<td class=""></td>
<td class=""></td>
<td class="active"></td>
<td class=""></td>
<td class="weekend "></td>
</tr>
</tbody>
</table>
</td>
<td class="time">1100-1230</td>
<!-- <td class="start" data-sort-value="1631073600">Sep 08, 2021</td> -->
<!-- <td class="end" data-sort-value="1638853200">Dec 07, 2021</td> -->
<td class="type">LEC</td>
<td class="data"></td>
</tr>
You are almost there...
You can retrieve a list of all the relevant web elements with the driver.find_elements
method and then iterate over that list, clicking each element.
Since course-row normal faculty-BU active
is actually four class names, not a single class name, By.CLASS_NAME (which expects a single class name) will not work with it. You should use an XPath or CSS selector there instead.
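As a minimal sketch of the CSS selector option: each class name in the attribute becomes a .name suffix on the tag name. The helper classes_to_css is a hypothetical name used for illustration, not part of Selenium; the string it returns would be passed with By.CSS_SELECTOR.

```python
def classes_to_css(tag: str, class_attr: str) -> str:
    """Build a CSS selector matching `tag` elements that carry every class in `class_attr`."""
    # split() with no arguments also discards repeated or trailing spaces
    return tag + "".join("." + name for name in class_attr.split())


selector = classes_to_css("tr", "course-row normal faculty-BU active")
print(selector)  # tr.course-row.normal.faculty-BU.active
```

Unlike an exact @class = "..." comparison in XPath, such a selector matches regardless of the order of the classes or of any extra classes on the row.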
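Also, it's recommended to use the visibility_of_element_located
expected condition here rather than a presence condition,
since presence is fulfilled as soon as the web element exists in the DOM, even when it is not yet rendered on the page, while visibility_of_element_located
waits until the web element is actually displayed.
Note that presence_of_elements_located does not exist. The plural presence condition is presence_of_all_elements_located, and it returns a list, so .click() cannot be called on its result.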
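import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, '//tr[@class = "course-row normal faculty-BU active"]')))
time.sleep(0.4)  # short delay so that ALL the rows have time to load
# find_elements (plural) returns a list; find_element returns a single WebElement
elements = driver.find_elements(By.XPATH, '//tr[@class = "course-row normal faculty-BU active"]')
for element in elements:
    element.click()  # expand the row
    # scrape the data you need here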
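For the scraping step, much of each row's data is already present in the data-* attributes of the tr element shown in the question, so one option is to read them with the WebElement get_attribute method. The following is only a sketch under that assumption: row_to_record, ROW_FIELDS and StubRow are hypothetical names, and StubRow merely stands in for a WebElement so the snippet runs without a browser.

```python
# Attribute names taken from the <tr> markup in the question; adjust as needed.
ROW_FIELDS = ("data-cc", "data-class_type", "data-course_section",
              "data-instructor", "data-location")


def row_to_record(row, fields=ROW_FIELDS):
    """Read the chosen data-* attributes from one row into a dict.

    `row` is anything with a get_attribute(name) method, e.g. a Selenium WebElement.
    """
    # Strip the "data-" prefix so the keys read as plain field names.
    return {name[len("data-"):]: row.get_attribute(name) for name in fields}


class StubRow:
    """Hypothetical stand-in for a WebElement, so the sketch runs without a browser."""

    def __init__(self, attrs):
        self._attrs = attrs

    def get_attribute(self, name):
        # Selenium returns None for a missing attribute; dict.get mirrors that.
        return self._attrs.get(name)


sample = StubRow({"data-cc": "ACTG1P01", "data-class_type": "LEC",
                  "data-course_section": "2",
                  "data-instructor": "Zhang, Xia (Celine)",
                  "data-location": "GSB306"})
print(row_to_record(sample))
```

Inside the for loop, something like records.append(row_to_record(element)) would then collect one dict per row.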