javascript arrays google-apps-script google-sheets cheerio

How to keep data in the correct rows of the worksheet when some values are not retrieved?


I'm using the CheerioGS library:

ID → 1ReeQ6WO8kKNxoaA_O0XEQ589cIrRvEBA9qcWpNqdOP17i47u6N9M5Xh0
Project → https://github.com/tani/cheeriogs

My full code (included in its entirety so that you can use it for testing):

function importdataweb() {
    const sheet = SpreadsheetApp.getActive().getSheetByName('Page_One');
    sheet.getRange('A1:Z' + sheet.getMaxRows()).clear({contentsOnly: true, skipFilteredRows: true});
    const url = 'http://www.futebolnatv.com.br/jogos-hoje/';
    const contentText = UrlFetchApp.fetch(url).getContentText();
    const $ = Cheerio.load(contentText);
  
    let elements = $('tbody > tr > th > h4 > b')
    elements.each((index, value) => {
      sheet.getRange(index+1, 1).setValue($(value).text().trim())
    })
    
    let elements_1 = $('tbody > tr > td > div:nth-child(1)')
    elements_1.each((index, value) => {
      sheet.getRange(index+1, 2).setValue($(value).text().trim())
    })
    
    let elements_2 = $('tbody > tr > td > div:nth-child(2)')
    elements_2.each((index, value) => {
      sheet.getRange(index+1, 3).setValue($(value).contents().last().text().trim())
    })
    
    let elements_3 = $('tbody > tr > td > div:nth-child(3)')
    elements_3.each((index, value) => {
      sheet.getRange(index+1, 4).setValue($(value).contents().last().text().trim())
    })
    
    let elements_4 = $('tbody > tr > td b:nth-child(2)')
    elements_4.each((index, value) => {
      sheet.getRange(index+1, 5).setValue($(value).text().trim())
    })
}

The result is:
[image]

For some reason, some values in the last column are not collected, so the values that are collected end up in the wrong rows of the worksheet.

For example, if the first value is not collected but the second is, the second value is written to row 1 of the column, which is wrong.
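
To make the problem concrete, here is a small sketch with made-up values (the match and channel names are hypothetical, not taken from the page). Each column is filled from its own list, so one missing entry shifts every later value up by one row:

```javascript
// Hypothetical scraped values: three matches, but only two channels were found.
const matches = ['Team A x Team B', 'Team C x Team D', 'Team E x Team F'];
const channels = ['Channel 2', 'Channel 3']; // the channel for the first match is missing

// Fill each column by its own index, as the script above does per column.
const grid = matches.map(match => [match, '']);
channels.forEach((channel, index) => {
  grid[index][1] = channel;
});

// 'Channel 2' lands next to the first match and the last row stays empty.
console.log(grid);
```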

How can I write a dash (-) in the rows where the value is not collected?


Solution

    let data = [];
    
    let rows = $('tbody > tr');
    rows.each((i, row) => {
      const newDataRow = [
        $(row).find('th > h4 > b').text().trim(),
        $(row).find('td > div:nth-child(1)').text().trim(),
        $(row).find('td > div:nth-child(2)').contents().last().text().trim(),
        $(row).find('td > div:nth-child(3)').contents().last().text().trim(),
        $(row).find('td b:nth-child(2)').text().trim(),
      ];
      data.push(newDataRow);
    });
    
    // fill empty values
    data = data.map(row => row.map(el => el ? el : '-'));
    
    
    // write values to sheet
    const startRow = 1;
    const startCol = 1;
    const numRows = data.length;
    const numCols = data[0].length;
    
    sheet.getRange(startRow, startCol, numRows, numCols).setValues(data);
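
One caveat with the write step: data[0].length throws if the selectors match no rows, and setValues expects every row to have the same number of columns. Below is a minimal sketch of a helper that covers both cases, written in plain JavaScript so it can be tried outside Apps Script. normalizeRows and its parameters are names invented for this sketch; they are not part of Cheerio or the Spreadsheet service.

```javascript
/**
 * Hypothetical helper: trims each cell, replaces empty cells with a
 * placeholder, and pads short rows so the result is rectangular.
 */
function normalizeRows(rows, placeholder = '-') {
  // Width of the widest row; 0 when there are no rows at all.
  const width = rows.reduce((max, row) => Math.max(max, row.length), 0);
  return rows.map(row => {
    const padded = row.concat(Array(width - row.length).fill(''));
    return padded.map(cell => {
      const text = String(cell == null ? '' : cell).trim();
      return text === '' ? placeholder : text;
    });
  });
}

// Usage inside the Apps Script function (sheet and data as in the code above):
// const table = normalizeRows(data);
// if (table.length > 0) {
//   sheet.getRange(1, 1, table.length, table[0].length).setValues(table);
// }
```

The guard on table.length simply skips the write when nothing was scraped, instead of failing on data[0].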
    

    Changes made to the part after you load the Cheerio object:

    1. Collect all the data first, and only then write it back to the sheet with a single setValues call. This reduces script execution time.
    2. Loop over the rows (let rows = $('tbody > tr')) with Cheerio, so that the values for each row are combined and written as one spreadsheet row. Because text() returns an empty string when nothing matches, a missing value stays in its own row and is replaced with '-'.

    The second change is the more important one: it lets you work with your data on a per-row basis later (sort, filter, etc.).
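
As a rough illustration of that per-row flexibility, the collected array can be filtered or sorted before it is written. The sample rows and the column meanings (time in the second column, channel in the last) are assumptions made for this sketch, not values taken from the page:

```javascript
// Hypothetical rows in the same shape as `data`: five columns per row.
const sample = [
  ['League X', '21:30', 'Team A', 'Team B', '-'],
  ['League Y', '19:00', 'Team C', 'Team D', 'Channel 2'],
  ['League Z', '16:00', 'Team E', 'Team F', 'Channel 3'],
];

// Keep only rows where a channel was found (assumed to be the last column).
const withChannel = sample.filter(row => row[4] !== '-');

// Sort a copy by the assumed time column; zero-padded HH:MM strings sort correctly as text.
const byTime = [...withChannel].sort((a, b) => a[1].localeCompare(b[1]));

console.log(byTime.map(row => row[1])); // 16:00 first, then 19:00
```

Since the result is still an array of equal-length rows, it can be passed to setValues in the same way as data.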