Tags: regex, perl, html-treebuilder

Perl HTML::TreeBuilder: match a tag whose attribute is not equal to a value


I am using HTML::TreeBuilder to extract data from an HTML file. What I need to do is something like this:

$div->look_down(_tag => 'a', 'href' !=> 'index.html')

So I am searching for an a tag whose href is not equal to 'index.html' (and one other tag), but obviously !=> is not a valid operator, so look_down cannot be called this way. How can I achieve something like that? Can I use a regular expression?

BR


Solution

  • look_down has no "not equal" criterion, but you can pass a regex that matches anything except that string, like this:

    $div->look_down( _tag => 'a', href => qr/\A(?!index\.html\z)/i )
    

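    Or you could pass a subroutine that makes the check. Testing defined first skips a tags that have no href at all and avoids an "uninitialized value" warning from lc:

    $div->look_down( _tag => 'a', sub { my $href = $_[0]->attr('href'); defined $href and lc $href ne 'index.html' } )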
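
To see how the two approaches behave, here is a minimal, self-contained sketch. The sample markup, the id "nav", and the variable names are made up for illustration and are not from the question. The subroutine version checks defined before comparing, so an a tag without an href is skipped there just as it is by the regex criterion.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample markup, invented for this illustration.
my $html = <<'HTML';
<div id="nav">
  <a href="index.html">Home</a>
  <a href="INDEX.HTML">Home again</a>
  <a href="about.html">About</a>
  <a href="index.html.bak">Backup</a>
  <a name="top">Anchor without href</a>
</div>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);
my $div  = $tree->look_down( _tag => 'div', id => 'nav' );

# Approach 1: the negative lookahead rejects a value that is exactly
# "index.html" (case-insensitively) and accepts any other defined value.
my @by_regex = $div->look_down( _tag => 'a', href => qr/\A(?!index\.html\z)/i );

# Approach 2: a code-ref criterion receives the element as $_[0].
my @by_sub = $div->look_down(
    _tag => 'a',
    sub {
        my $href = $_[0]->attr('href');
        return defined($href) && lc($href) ne 'index.html';
    },
);

my @regex_hrefs = map { $_->attr('href') } @by_regex;
my @sub_hrefs   = map { $_->attr('href') } @by_sub;

print "regex: @regex_hrefs\n";   # regex: about.html index.html.bak
print "sub:   @sub_hrefs\n";     # sub:   about.html index.html.bak

$tree->delete;                   # release the parse tree
```

In list context look_down returns every matching element; in scalar context it returns only the first one, so pick the context that fits what you need.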