
Get the text of every h2 and return a readable array (via Apify)


I'm using Apify's Web Scraper to scrape some data from a website. The goal is to get the text of every H2 and return them as an array. My problem is with the returned array: it is incorrect and unusable because each scraped text is split into its individual letters.

I tried this code (JavaScript with jQuery):

function pageFunction() {
  const results = []

  $('h2').map(function() {
    results.push($(this).text());
  });

  return results;
}

console.log(pageFunction());
<h2>Heading One</h2>
<h2>Heading Two</h2>
<h2>Heading Three</h2>

<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>

And I get this result when I export to JSON:

[{
  "0": "M",
  "1": "u",
  "2": "t",
  "3": "i",
  "4": "n",
  "5": "y"
},
{
  "0": "G",
  "1": "r",
  "2": "o",
  "3": "w",
  "4": "S",
  "5": "u",
  "6": "m",
  "7": "o"
},
{
  "0": "C",
  "1": "u",
  "2": "s",
  "3": "t",
  "4": "o",
  "5": "m",
  "6": "e",
  "7": "r",
  "8": ".",
  "9": "i",
  "10": "o"
}]

I would like something like this:


[{
"tool": "Mutiny"
},
{
"tool": "Growsumo"
},
{
"tool":"customer.io"
}]

Solution

  • Regardless of why each string is being split into a character array, you can get the desired output by changing the map so that it returns the object you want rather than just the text:

    .map((i, e) => $(e).text())
    

    gives you an array of the text, but you want an array of { tool: <text> }, so:

    .map((i, e) => { return { tool : $(e).text() } })
    

    (note: jQuery .map() requires .get() at the end to convert the result to a "true" array)

    You'll then have an array of objects, which you can pass to JSON.stringify to get your desired JSON output:

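    // Build one { tool: <text> } object per <h2>
    let results = $('h2').map((i, e) => {
        return { tool: $(e).text() };
    }).get(); // .get() turns the jQuery collection into a plain array

    //console.log(results);

    // Serialize the array to get the desired JSON output
    let json = JSON.stringify(results);

    console.log(json);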
    <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
    
    <h2>Mutiny</h2>
    <h2>Growsumo</h2>
    <h2>Customer.io</h2>
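As a side note on the likely cause: the broken export has exactly the shape you get when a bare string is coerced into an object, one key per character. The sketch below runs in plain Node without jQuery or Apify. It first reproduces that shape with object spread, then shows the wrapping step on an ordinary array of strings. The helper name `toToolRecords` and the trimming and empty-string filtering are my own assumptions for illustration, not part of the original code or of any Apify API.

```javascript
// Hypothetical helper (not part of jQuery or Apify): wrap each scraped
// heading text in an object so every record has a named field.
function toToolRecords(texts) {
  return texts
    .map((text) => text.trim())           // assumption: surrounding whitespace is unwanted
    .filter((text) => text.length > 0)    // assumption: empty headings are skipped
    .map((text) => ({ tool: text }));
}

// Spreading a bare string into an object yields one key per character,
// which matches the shape of the broken export in the question.
const coerced = { ...'Mutiny' };

const records = toToolRecords(['Mutiny', ' Growsumo ', '', 'Customer.io']);

console.log(JSON.stringify(coerced));
console.log(JSON.stringify(records));
```

If the scraper stores each element of the returned array as one record, which is what the export in the question suggests, then returning objects instead of bare strings avoids the per-character keys.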