I want to extract the text 'AT401726' from the HTML given below:
<td class="publicationInfoColumn">
<h4>Publication info:</h4>
AT401726<br>2008-08-15
</td>
I solved it using jQuery; the working code is given below:
$('body').find('.publicationInfoColumn').clone().children().remove().end().text()
Is there a better technique for extracting this data? My crawled HTML page contains many blocks like the one above.
The text you are looking for is the content of the node that immediately follows the h4 element (its next sibling, which here is a text node), so try:
var text = $.trim($('.publicationInfoColumn h4').prop('nextSibling').nodeValue);
console.log(text)
Demo: Fiddle
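
Since the question mentions many such blocks on one page, here is a rough sketch of how the same next-sibling idea could be applied to every cell. The names textAfter and collectPublicationNumbers are hypothetical helpers made up for this example, not jQuery functions, and the sketch assumes each cell has the same h4-then-text layout as the snippet in the question.

```javascript
// Numeric value of Node.TEXT_NODE, written out so the helper does not
// depend on the browser's global Node object.
var TEXT_NODE = 3;

// Hypothetical helper: returns the trimmed value of the first non-empty
// text node that follows `heading` among its siblings, or null if none.
function textAfter(heading) {
  for (var node = heading.nextSibling; node; node = node.nextSibling) {
    if (node.nodeType === TEXT_NODE) {
      var value = node.nodeValue.trim();
      if (value) {
        return value;
      }
    }
  }
  return null;
}

// Hypothetical helper, browser only: assumes jQuery is loaded and passed
// in as `$`. Returns one string per matching cell; jQuery's .map() drops
// entries for which the callback returns null.
function collectPublicationNumbers($) {
  return $('.publicationInfoColumn h4').map(function () {
    return textAfter(this);
  }).get();
}
```

In the browser this would be called as collectPublicationNumbers(jQuery). Unlike reading nextSibling directly, the loop skips whitespace-only text nodes, which may help if some cells have a line break between the h4 and the publication number.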