
XPath for extracting text from a node and its child nodes


Here is my situation.

I want to select only "Buy 2 Hills Feline Maint Light 10kg and Save a further £4.00!" from the HTML below.

Note: I am using XPath 1.0.

<div>
    <a>
        <b>
            <u>Multi-Buy:</u>
        </b>
        <br/>
        Buy 
        <b>2</b>
         Hills Feline Maint Light 10kg and 
        <b>
            <font color="#CC0000">Save a further £4.00!</font>
        </b>
        <br/>
        <i>Simply add 2 to your basket.</i>
    </a>
</div>

Here is what I have tried:

//div/a/text()

With this, I am missing the text of the child nodes.

/div/a//text()

With this, I get extra text that I do not want ("Multi-Buy:" and "Simply add 2 to your basket.").


Solution

  • Since this HTML is not structured in a way that makes the wanted text easy to isolate, I would propose selecting all descendant text nodes and excluding the unwanted ones by their content:

    /div/a//text()[not(. = 'Multi-Buy:' or contains(., 'to your basket'))]
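The expression above returns a set of text nodes (including whitespace-only ones), not a single string, and XPath 1.0 has no function for joining a node-set, so the joining would have to happen in the host language. Below is a minimal sketch of that step, assuming Python and only its standard library. Note that `xml.etree.ElementTree` supports only a small XPath subset (no `text()` node test and no function calls in predicates), so the sketch does not evaluate the XPath itself: it walks the same text nodes with `itertext()` and applies the same two exclusions in Python. The names `HTML` and `extract_offer` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The markup from the question; it happens to be well-formed XML,
# which is what lets the standard-library parser read it.
HTML = """<div>
    <a>
        <b>
            <u>Multi-Buy:</u>
        </b>
        <br/>
        Buy 
        <b>2</b>
         Hills Feline Maint Light 10kg and 
        <b>
            <font color="#CC0000">Save a further £4.00!</font>
        </b>
        <br/>
        <i>Simply add 2 to your basket.</i>
    </a>
</div>"""


def extract_offer(markup: str) -> str:
    """Hypothetical helper: mirror the XPath filter, then join the pieces."""
    root = ET.fromstring(markup)  # root is the <div> element
    anchor = root.find("a")       # the <a> child, as in /div/a

    # itertext() yields the text of <a> and all its descendants in
    # document order, which corresponds to /div/a//text().
    kept = [
        text
        for text in anchor.itertext()
        # Same condition as the XPath predicate:
        # not(. = 'Multi-Buy:' or contains(., 'to your basket'))
        if not (text == "Multi-Buy:" or "to your basket" in text)
    ]

    # Concatenate the remaining nodes and collapse runs of whitespace
    # (newlines and indentation) into single spaces.
    return " ".join("".join(kept).split())


print(extract_offer(HTML))
```

With a library that evaluates full XPath 1.0, the expression from the answer could be used directly and only the final join-and-collapse line would be needed.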