
XPath for parent's sibling descendants


I have the following HTML to scrape, but the only reliable handle is the stable text of a label. From there, I need to go to its parent, find that parent's next sibling, and then get its descendants (unfortunately, the data-automation-id values repeat in every iteration of this snippet on the site, so I can't select on them alone). I put together the XPath below, but my RPA tool is unable to find it in the document.

XPath:

div[contains(text(),'STABLE TEXT HANDLE')]/following-sibling::div/div/div/span[data-automation-id="SOMETHING"]

HTML:

<ul>
   <li>
      <div>
          <label>STABLE TEXT HANDLE</label>
      </div>
      <div>
          <div>
              <div>
                  <span></span>
                  <span data-automation-id="something">
                      <div>
                          <div>
                              <div>
                                  DYNAMIC TEXT I WANT TO SCRAPE
                              </div>
                          </div>
                      </div>
                  </span>
                  <span data-automation-id="somethingelse">
                      <div>
                          <div>
                              <div>
                                  DYNAMIC TEXT I WANT TO SCRAPE
                              </div>
                          </div>
                      </div>
                  </span>
              </div>
          </div>
      </div>
   </li>
</ul>

EDIT:

After further testing, it seems the issue starts with contains(text(),'STABLE TEXT HANDLE'), which fails to find that particular node (be it the label or its parent div).


Solution

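  • The text belongs to the label, not to its parent div, so match the label, step up to the li with /../.., and then search its descendants. Note also that the expression needs a leading // to search the whole document, attribute tests need the @ prefix, and the attribute value is case-sensitive ("something", not "SOMETHING"). Please try this: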

    //label[contains(text(),'STABLE TEXT HANDLE')]/../..//span[@data-automation-id="something"]
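To check the expression outside the RPA tool, here is a minimal sketch in Python. It assumes the third-party lxml library is installed and that the page fragment is well-formed enough to parse as XML. The names SNIPPET and texts are made up for this example, and FIRST VALUE / SECOND VALUE stand in for the dynamic text. The following-sibling variant is not part of the answer above. It is only an alternative that follows the route described in the question (parent, next sibling, descendants). Your RPA tool's XPath engine may behave differently from lxml's.

```python
from lxml import etree

# Trimmed copy of the HTML from the question; the two values are placeholders
# for the dynamic text.
SNIPPET = """
<ul>
   <li>
      <div>
          <label>STABLE TEXT HANDLE</label>
      </div>
      <div>
          <div>
              <div>
                  <span></span>
                  <span data-automation-id="something">
                      <div><div><div>FIRST VALUE</div></div></div>
                  </span>
                  <span data-automation-id="somethingelse">
                      <div><div><div>SECOND VALUE</div></div></div>
                  </span>
              </div>
          </div>
      </div>
   </li>
</ul>
"""

root = etree.fromstring(SNIPPET)


def texts(xpath):
    """Return the whitespace-normalised text of every node the XPath matches."""
    return [" ".join(node.xpath("string(.)").split()) for node in root.xpath(xpath)]


# XPath from the answer: label -> parent div -> li -> any matching span below it.
ANSWER_XPATH = (
    "//label[contains(text(),'STABLE TEXT HANDLE')]"
    "/../..//span[@data-automation-id=\"something\"]"
)

# Alternative (not from the answer): pick the div that holds the label,
# then move to its following sibling div and search that subtree.
SIBLING_XPATH = (
    "//div[label[contains(text(),'STABLE TEXT HANDLE')]]"
    "/following-sibling::div//span[@data-automation-id='something']"
)

# The expression from the question, with only a leading // added. It still
# matches nothing: the div's own text is whitespace, and the last predicate
# looks for a child element named data-automation-id instead of an attribute.
ORIGINAL_XPATH = (
    "//div[contains(text(),'STABLE TEXT HANDLE')]"
    "/following-sibling::div/div/div/span[data-automation-id=\"SOMETHING\"]"
)

print(texts(ANSWER_XPATH))
print(texts(SIBLING_XPATH))
print(texts(ORIGINAL_XPATH))
```

If both working expressions return the expected text here but the RPA tool still finds nothing, the cause is probably not the XPath itself. It may be how the tool loads the page (for example, content rendered later by JavaScript or placed inside an iframe).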