Tags: html, xml, xpath, xpath-2.0, domxpath

XPath Query to get the <a> tag and <b> tag


I have spent the whole day trying to get the values from the <a> and <b> tags using an XPath query, without success. Can somebody help me with the XPath query I need to use to get them? Please see the HTML below.

<html class="chrome webkit">
#shadow-root
<head>...</head>
<body id="jira" class="aui-layout aui-theme-default page-type-dashboard" data-version="6.1.2" data-aui-version="5.1.6">
<div id="page">
<header id="header" role="banner">...</header>
<div id="announcement-banner" class="alertHeader">
  <b> Production </b>
  <marquee scrollamoun="3" behaviour="alternate" onmouseover="this.stop()" onmouseout ="this.start()">..</marquee>
<#shadow-root
<font-color="red"> Note:Please check out 1.</font>
<a href="https://docs.google.com/a/query.com/document/" target="_blank">
 <b>
  <font color ="red"> GSD Service </font>
 </b>
</a>
</marquee>
</div>
<section id="content" role="main">...</section>
<footer id="footer" role="contentinfo">...</footer>
</div>
<div class="shim hidden"></div>
<div class="shim hidden"></div>
<div class="shim hidden"></div>
<div class="shim hidden"></div>
<div class="shim hidden"></div>
<div class="shim hidden"></div>
</body>
</html>

Similarly, I have another three <a> tags after this one, so I want to get all the <a> tags and all the <b> tags separately to show them in my application. Please help me with the XPath query.


Solution

  • Assuming that your HTML is well-formed, the following XPath will select all a elements:

    //a
    

    Just the first a:

    (//a)[1]
    

    Just the first a within the div whose @id is page:

    (//div[@id='page']//a)[1]
    

    You can equally easily apply these concepts to selecting b.
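
As a rough illustration of the expressions above, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`. The markup in `DOC` is a simplified, well-formed stand-in for the posted page, not the real one, and `text_of` is a helper invented for this example. ElementTree supports only a subset of XPath: paths must be relative (hence the leading `.`), and `(//a)[1]` is approximated with `find()`, which returns the first match in document order.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the posted page.
DOC = """
<html>
  <body>
    <a href="#top"><b>Top</b></a>
    <div id="page">
      <div id="announcement-banner">
        <b>Production</b>
        <a href="https://docs.google.com/a/query.com/document/" target="_blank">
          <b>GSD Service</b>
        </a>
        <a href="#second" target="_blank"><b>Second link</b></a>
      </div>
    </div>
  </body>
</html>
"""

root = ET.fromstring(DOC)


def text_of(el):
    # Concatenate all descendant text and trim surrounding whitespace.
    return "".join(el.itertext()).strip()


# //a : every a element (ElementTree needs a relative path, so ".//a").
all_a = [text_of(a) for a in root.findall(".//a")]

# (//a)[1] : find() returns the first match in document order.
first_a = text_of(root.find(".//a"))

# (//div[@id='page']//a)[1] : first a inside the div whose id is "page".
first_a_in_page = text_of(root.find(".//div[@id='page']//a"))

# //b : the same idea applied to b.
all_b = [text_of(b) for b in root.findall(".//b")]

print(all_a)
print(first_a)
print(first_a_in_page)
print(all_b)
```

A full XPath 1.0 engine would accept the expressions exactly as written in the answer; the relative-path form here is only an ElementTree requirement.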

    Update

    The following XPath will select all a elements you indicated in the comments that you need:

    //div[@id='page']//div[@id='announcement-banner']//a[@target='_blank']
    

    Notes:

    • The target value in your comment may not match your posted HTML exactly. The a in your posted HTML has target="_blank", which is what the XPath above tests for, so adjust the predicate if your actual markup differs.
    • If you want immediate containment rather than containment at any depth, use / rather than //.
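
To make the difference between `//` and `/` concrete, the sketch below runs the banner query against a small, hypothetical fragment (the `href` values and nesting are made up for illustration) with Python's standard-library `xml.etree.ElementTree`. As before, ElementTree accepts only relative paths, so each expression starts with `.`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed reduction of the announcement banner.
DOC = """
<body>
  <div id="page">
    <div id="announcement-banner">
      <a href="#direct" target="_blank">direct child</a>
      <b>
        <a href="#nested" target="_blank">nested in b</a>
      </b>
      <a href="#same-tab">no target</a>
    </div>
  </div>
</body>
"""

root = ET.fromstring(DOC)

# Containment at any depth: "//" before a matches links nested anywhere
# inside the banner, as long as target is "_blank".
any_depth = root.findall(
    ".//div[@id='page']//div[@id='announcement-banner']//a[@target='_blank']"
)
any_depth_hrefs = [a.get("href") for a in any_depth]

# Immediate containment: "/" before a matches only direct children
# of the banner div.
direct = root.findall(".//div[@id='announcement-banner']/a[@target='_blank']")
direct_hrefs = [a.get("href") for a in direct]

print(any_depth_hrefs)
print(direct_hrefs)
```

In this fragment the `//` form picks up the link nested inside `b`, while the `/` form does not, which is the distinction the last note describes.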