ruby nokogiri scraper open-uri

Nokogiri: select paragraph with text match


I wrote a scraper and I am trying to get only the text of the paragraph that includes "On Snow Feel".

I'm not sure how to have Nokogiri pull out the paragraph whose text matches a given string.

At the moment I have boards[:onthesnowfeel] = html.css(".reviewfold p").text, but this captures all the paragraphs. The paragraphs will not always be in the same order, so I can't just index with [2] or something similar.

What method would you use to scrape the paragraph that matches the text "On Snow Feel"?

<div id="review" class="reviewfold">
<p>The <strong>Salomon A</strong><b>assassin</b>&nbsp;Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. </p>
<p><b>Approximate Weight</b>: Moew mix is pretty normal</p>
<p><strong>On Snow Feel:&nbsp;</strong>At vero eos et accusamus et iusto odio dignissimos ducimus qui blanditiis praesentium voluptatum.</p>
<p><strong>Powder:&nbsp;</strong>It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. </p>
</div>

Solution

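  • You could use Enumerable#find in combination with the regexp match operator =~ to get the content of the desired element. Note that find returns nil when no paragraph matches, so the trailing .text would raise NoMethodError in that case.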

    html.css(".reviewfold p").find { |e| e.text =~ /On Snow Feel/ }.text