My main goal is to use cheerio to scrape the titles from this IMDb ranking:
https://www.imdb.com/chart/tvmeter/?ref_=nv_tvv_mptv
However, even though I followed cheerio's documentation and used the exact HTML path of the listed titles, I still get back seemingly random and confusing objects, like:
'x-attribsNamespace': [Object: null prototype] {},
'x-attribsPrefix': [Object: null prototype] {}
},
'80': <ref *81> Element {
parent: Element {
parent: [Element],
prev: [Text],
next: [Text],
startIndex: null,
endIndex: null,
children: [Array],
name: 'tbody',
attribs: [Object: null prototype],
type: 'tag',
namespace: 'http://www.w3.org/1999/xhtml',
'x-attribsNamespace': [Object: null prototype],
'x-attribsPrefix': [Object: null prototype]
},
prev: Text {
parent: [Element],
prev: [Element],
next: [Circular *81],
startIndex: null,
endIndex: null,
data: '\n\n ',
type: 'text'
},
next: Text {
parent: [Element],
prev: [Circular *81],
next: [Element],
startIndex: null,
endIndex: null,
data: '\n\n ',
type: 'text'
},
startIndex: null,
endIndex: null,
children: [
[Text], [Element],
[Text], [Element],
[Text], [Element],
[Text], [Element],
[Text], [Element],
[Text]
],
name: 'tr',
attribs: [Object: null prototype] {},
type: 'tag',
namespace: 'http://www.w3.org/1999/xhtml',
'x-attribsNamespace': [Object: null prototype] {},
'x-attribsPrefix': [Object: null prototype] {}
},
'81': <ref *82> Element {
parent: Element {
parent: [Element],
prev: [Text],
next: [Text],
startIndex: null,
endIndex: null,
children: [Array],
name: 'tbody',
attribs: [Object: null prototype],
type: 'tag',
namespace: 'http://www.w3.org/1999/xhtml',
'x-attribsNamespace': [Object: null prototype],
'x-attribsPrefix': [Object: null prototype]
},
prev: Text {
parent: [Element],
prev: [Element],
next: [Circular *82],
startIndex: null,
endIndex: null,
data: '\n\n ',
type: 'text'
},
next: Text {
parent: [Element],
prev: [Circular *82],
next: [Element],
startIndex: null,
endIndex: null,
data: '\n\n ',
type: 'text'
},
startIndex: null,
endIndex: null,
children: [
[Text], [Element],
[Text], [Element],
[Text], [Element],
[Text], [Element],
[Text], [Element],
[Text]
],
name: 'tr',
attribs: [Object: null prototype] {},
type: 'tag',
namespace: 'http://www.w3.org/1999/xhtml',
'x-attribsNamespace': [Object: null prototype] {},
'x-attribsPrefix': [Object: null prototype] {}
},
'82': <ref *83> Element {
parent: Element {
parent: [Element],
prev: [Text],
next: [Text],
startIndex: null,
endIndex: null,
children: [Array],
name: 'tbody',
attribs: [Object: null prototype],
type: 'tag',
namespace: 'http://www.w3.org/1999/xhtml',
'x-attribsNamespace': [Object: null prototype],
'x-attribsPrefix': [Object: null prototype]
},
prev: Text {
parent: [Element],
prev: [Element],
next: [Circular *83],
startIndex: null,
endIndex: null,
data: '\n\n ',
type: 'text'
},
next: Text {
parent: [Element],
prev: [Circular *83],
next: [Element],
startIndex: null,
endIndex: null,
data: '\n\n ',
type: 'text'
},
startIndex: null,
endIndex: null,
children: [
[Text], [Element],
[Text], [Element],
[Text], [Element],
[Text], [Element],
[Text], [Element],
[Text]
],
name: 'tr',
attribs: [Object: null prototype] {},
type: 'tag',
namespace: 'http://www.w3.org/1999/xhtml',
'x-attribsNamespace': [Object: null prototype] {},
'x-attribsPrefix': [Object: null prototype] {}
},
My code:
import * as cheerio from 'cheerio';
import axios from 'axios';
import fs from 'fs';

axios("https://www.imdb.com/chart/tvmeter/?ref_=nv_tvv_mptv").then(res => {
  const data = res.data;
  const $ = cheerio.load(data);
  var cheerioData = $('.lister-list>tr').each((i, e) => {
    const title = $(e).find('.titleColumn a').text();
    console.log(title);
  })
  console.log(cheerioData);
})
I really don't understand what I'm doing wrong, since the path is completely correct. Can anybody help me?
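You can convert the Cheerio collection of elements to an array of text strings using map combined with a spread, .get(), or .toArray(). The confusing objects in your output come from console.log(cheerioData): .each returns the Cheerio collection it was called on, not the titles.
For example, with spread and vanilla JS Array#map: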
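import axios from "axios";
// Namespace import, as in the question; recent cheerio releases
// do not provide a default export.
import * as cheerio from "cheerio";

const url = "<Your URL>";

axios(url).then(res => {
  const $ = cheerio.load(res.data);
  // Spread the collection into a plain array of elements, then map to text
  const text = [...$(".lister-list > tr")].map(e =>
    $(e).find(".titleColumn a").text().trim()
  );
  console.log(text);
});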
The same is possible using .get() or .toArray() after a Cheerio .map (whose callback takes the index as its first argument):
const text = $(".lister-list > tr").map((i, e) =>
  $(e).find(".titleColumn a").text().trim()
).get();
If you want to use .each, you can .push() each text string onto a vanilla array, but this isn't as clean as .map, which exists to abstract away this pattern:
const text = [];
$(".lister-list > tr").each((i, e) => {
  text.push($(e).find(".titleColumn a").text().trim());
});