javascript, html, cheerio

How do I get the text after a single <br> tag in Cheerio?


I'm using Cheerio and trying to get some text that comes right after a single <br> tag.

I've already tried each of the following lines, without success:

let price = $(this).nextUntil('.col.search_price.discounted.responsive_secondrow').find('br').text().trim();
let price = $(this).nextUntil('.col.search_price.discounted.responsive_secondrow.br').text().trim();

Here is the HTML I'm trying to scrape:

<div class="col search_price_discount_combined responsive_secondrow" data-price-final="5039">
  <div class="col search_discount responsive_secondrow">
    <span>-90%</span>
  </div>
  <div class="col search_price discounted responsive_secondrow">
    <span style="color: #888888;"><strike>ARS$ 503,99</strike></span><br>ARS$ 50,39     
  </div>
</div>

I would like to get "ARS$ 50,39".


Solution

  • If you're comfortable assuming this text is the last child node, you can use .contents().last():

    const cheerio = require("cheerio"); // 1.0.0-rc.12
    
    const html = `
    <div class="col search_price_discount_combined responsive_secondrow" data-price-final="5039">
      <div class="col search_discount responsive_secondrow">
        <span>-90%</span>
      </div>
      <div class="col search_price discounted responsive_secondrow">
        <span style="color: #888888;"><strike>ARS$ 503,99</strike></span><br>ARS$ 50,39     
      </div>
    </div>
    `;
    const $ = cheerio.load(html);
    const sel = ".col.search_price.discounted.responsive_secondrow";
    const text = $(sel).contents().last().text().trim();
    console.log(text); // => ARS$ 50,39
    

    If you aren't comfortable with that assumption, you can search through the children to find the first non-empty text node:

    // ...
    const text = $([...$(sel).contents()]
      .find(e => e.type === "text" && $(e).text().trim()))
      .text()
      .trim();
    console.log(text); // => ARS$ 50,39
    

    If it's critical that the text node immediately follows a <br> tag specifically, you can try:

    // ...
    const contents = [...$(sel).contents()];
    const text = $(contents.find((e, i) =>
        e.type === "text" && contents[i-1]?.tagName === "br"
      ))
      .text()
      .trim();
    console.log(text); // => ARS$ 50,39
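
    The `$(this)` in the question suggests the lookup runs inside a loop over several result rows. A minimal sketch of wrapping the <br> check in a helper for that case follows; `textAfterBr`, the `.row` wrapper, and the second sample row are hypothetical, made up for illustration, and the fallback to the element's full text is just one possible choice:

    ```javascript
    const cheerio = require("cheerio");

    // Hypothetical sample: two rows, the second without a discount or <br>.
    const html = `
    <div class="row">
      <div class="col search_price discounted responsive_secondrow">
        <span><strike>ARS$ 503,99</strike></span><br>ARS$ 50,39
      </div>
    </div>
    <div class="row">
      <div class="col search_price responsive_secondrow">ARS$ 120,00</div>
    </div>`;
    const $ = cheerio.load(html);

    // Hypothetical helper: trimmed text of the first text node that directly
    // follows a <br> inside `el`, or null when there is no such node.
    function textAfterBr($, el) {
      const nodes = $(el).contents().toArray();
      const node = nodes.find(
        (e, i) => e.type === "text" && nodes[i - 1]?.tagName === "br"
      );
      return node ? $(node).text().trim() : null;
    }

    // Fall back to the element's whole text when there is no <br>.
    const prices = $(".search_price")
      .map((_, el) => textAfterBr($, el) ?? $(el).text().trim())
      .get();
    console.log(prices); // => [ 'ARS$ 50,39', 'ARS$ 120,00' ]
    ```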
    

    If you want all of the immediate text children, see:
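
    One possible sketch, not necessarily what the reference above describes: filter .contents() down to text nodes and keep the non-empty ones. The sample markup here is made up for illustration.

    ```javascript
    const cheerio = require("cheerio");

    // Made-up markup: two immediate text nodes plus one nested inside a <span>.
    const html = `<div class="price">Before<br>ARS$ 50,39<span>nested</span></div>`;
    const $ = cheerio.load(html);

    const texts = $(".price")
      .contents()
      .toArray()
      .filter(e => e.type === "text") // immediate text nodes only
      .map(e => $(e).text().trim())
      .filter(Boolean); // drop whitespace-only nodes
    console.log(texts); // => [ 'Before', 'ARS$ 50,39' ]
    ```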