I've created a number of JS scripts similar to the one below, which generate a .tsv download of web-scraped data (this particular example assumes you're on the URL of a repo's Contributors page on GitLab). Everything outputs fine when I open the .tsv in Microsoft Excel, except that the string 'undefined' is prepended to every value after the header row, in the first column only.
How do I edit the script so that 'undefined' no longer appears? Even if it's a simple fix, it will let me clean up the similar output of a bunch of other scripts that scrape other websites.
javascript:(function(){
var arr = new Array, i, commitsemail, commitsnum, email, fullname, matchnum;
var regexnum = /.+?(?=commits)/g; var regexemail = /(?<=\().*(?=\))/g;
var glab = document.querySelectorAll('div.col-lg-6.col-12.gl-my-5');
var strings='Full name'+'\t'+'Email'+'\t'+'# of commits'+'\r\n';
var endurl = document.URL.split(/[/]+/).pop(); if (endurl != 'master') {
alert('You are not on the contributors page of a Gitlab repo. Press Esc key, go to URL ending with /master and re-run this bookmarklet'); } else {
for (i = 0; i<glab.length; i++) {
fullname = glab[i].getElementsByTagName('h4')[0].textContent;
commitsemail = glab[i].getElementsByTagName('p')[0].textContent;
commitsnum = [...commitsemail.match(regexnum)];
email = [...commitsemail.match(regexemail)];
arr[i] += fullname + '\t' + email + '\t' + commitsnum;
strings += arr[i]+'\r\n'; }
var pom = document.createElement('a');
var csvContent = strings; var blob = new Blob([csvContent],{type: 'text/tsv;charset=utf-8;'});
var url = URL.createObjectURL(blob); pom.href = url; pom.setAttribute('download','gitlab-contributors.tsv'); pom.click(); }
})();
It's because of the += on the line arr[i] += fullname + '\t' + email + '\t' + commitsnum; Change that to an = instead.
Before the assignment, arr[i] is undefined. Maybe you mixed up the syntax for assigning an array entry by index with the syntax for appending to an array (arr.push(...)), thinking += would push, but it doesn't. It adds the new value to the current value. Since that line is the first time arr[i] is assigned anything, the current value is undefined, which gets converted to the string 'undefined' and concatenated in front of your text.
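To see the difference in isolation, here is a minimal sketch you can paste into the browser console or run in Node. The rows array and its names and emails are made-up sample data standing in for the scraped values, and buggy, fixed and pushed are hypothetical variable names, not part of your script.

```javascript
// Hypothetical sample rows standing in for the scraped name, email and commit count.
const rows = [
  ['Ada Lovelace', 'ada@example.com', '12'],
  ['Alan Turing', 'alan@example.com', '7'],
];

// Buggy: buggy[i] is undefined before the first assignment, so += converts it
// to the string 'undefined' and concatenates the new text after it.
const buggy = [];
for (let i = 0; i < rows.length; i++) {
  buggy[i] += rows[i].join('\t');
}

// Fixed: plain assignment by index.
const fixed = [];
for (let i = 0; i < rows.length; i++) {
  fixed[i] = rows[i].join('\t');
}

// Alternative: append with push instead of assigning by index.
const pushed = [];
for (const row of rows) {
  pushed.push(row.join('\t'));
}

console.log(buggy[0].startsWith('undefined')); // true
console.log(fixed[0].startsWith('undefined')); // false
console.log(pushed.join('\r\n') === fixed.join('\r\n')); // true
```

Either the = version or the push version should work in your bookmarklet. Since you only use arr[i] to build strings, you could also drop the array and append the row to strings directly.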