javascript href getelementsbyclassname

getElementsByClassName and <a class="asdf" href="url.com">String</a>


I'd like to collect data from a webpage that contains a lot of lines like this:

<a class="asdf" href="http://url.com/jkl/0123/qwer">String</a>

From this line I need the numbers from the URL (0123) and the String. I figured out how to get the numbers, but I have problems with the String. I have this code, which collects the numbers in an array:

var titles = document.getElementsByClassName("link-title");
var ids=[];
var tmp;
var i;
for (i=0; i<titles.length; i++) {
    tmp=titles[i].toString().split("/");
    ids.push(tmp[4]);
}

Is it possible to get the Strings from the titles as well? I'm a complete beginner in JavaScript. I have learned Java and a little XML and could do this in Java, but the webpage has some kind of DDoS protection, so I can't connect to it or download it from there.


Solution

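  • The things you get back from getElementsByClassName() are DOM nodes. Calling .toString() on a node is rarely what you want (an anchor element happens to stringify to its href, which is why the code in the question works), but the DOM APIs let you read the attributes and the node contents directly: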

    for (i=0; i<titles.length; i++) {
        ids.push( titles[i].href );
    }
    

    That extracts each link's href into your array. (You can still do the .split() if you want pieces of the URLs, of course.) If you want the text instead:

    for (i=0; i<titles.length; i++) {
        ids.push( titles[i].textContent );
    }
    

    For compatibility with older versions of Internet Explorer (before IE9), which lack textContent, that would be:

    for (i=0; i<titles.length; i++) {
        ids.push( titles[i].textContent || titles[i].innerText );
    }
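
Putting the two pieces together, here is a minimal sketch that collects both the number and the text for each link in one pass. The helper name extractLinks and the id/text field names are made up for illustration. It also assumes every URL has the same shape as the example in the question, so that the number is the fifth piece after splitting on "/".

```javascript
// Hypothetical helper (not part of any library): takes an array-like list of
// anchor elements and returns one { id, text } object per link.
function extractLinks(anchors) {
    var results = [];
    for (var i = 0; i < anchors.length; i++) {
        // "http://url.com/jkl/0123/qwer".split("/") gives
        // ["http:", "", "url.com", "jkl", "0123", "qwer"], so the number is at index 4.
        // Assumption: all links follow this URL layout.
        var parts = anchors[i].href.split("/");
        results.push({
            id: parts[4],
            // innerText is the fallback for old Internet Explorer, as above.
            text: anchors[i].textContent || anchors[i].innerText || ""
        });
    }
    return results;
}

// In the browser console, using the class name from the question's code:
if (typeof document !== "undefined") {
    console.log(extractLinks(document.getElementsByClassName("link-title")));
}
```

Keeping the number and the text together in one object avoids having to maintain two parallel arrays that must stay in the same order.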