
Fetch multiple, external URLs with GM_xmlhttpRequest, add page <H1> to links?

SOLVED thanks to Hellion's help!

Here is the code:

// ==UserScript==
// @name          Facebook Comment Moderation Links
// @description   Appends story titles to Facebook Comment Moderation "Visit Website" links
// @include       http*://*
// @grant         GM_xmlhttpRequest
// ==/UserScript==

var allLinks, thisLink, expr, myURL;

// fetch one page and attach its headline next to the given link
function fetchPage(myPage, targetLink) {
    GM_xmlhttpRequest({
        method: 'GET',
        url: myPage,
        onload: function(response) {

            // get the HTML content of the page
            var pageContent = response.responseText;

            // use regex to extract its h1 tag; skip pages that have none
            var matches = pageContent.match(/<h1.*?>(.*?)<\/h1>/g);
            if (!matches) { return; }

            // strip html tags from the result
            var pageTitle = matches[0].replace(/<.*?>/g, '');

            // append headline to Visit Website link
            var title = document.createElement('div');
            title.style.backgroundColor = "yellow";
            title.style.color = "#000";
            title.appendChild(document.createTextNode(pageTitle));
            targetLink.parentNode.insertBefore(title, targetLink.nextSibling);
        }
    });
}

function processLinks() {

    // define which links to look for
    expr = "//a[contains (string(), 'Visit Website')]";

    // a snapshot result type is needed for snapshotLength / snapshotItem
    allLinks = document.evaluate(
        expr,
        document,
        null,
        XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE,
        null);

    // loop through the links
    for (var i = 0; i < allLinks.snapshotLength; i++) {
        thisLink = allLinks.snapshotItem(i);
        myURL = thisLink.getAttribute('href');

        // follow Visit Website link and attach corresponding headline
        fetchPage(myURL, thisLink);
    }
}

// get the ball rolling
processLinks();

I am trying to make a Greasemonkey script that fetches the URL from each of a set of links and appends the contents of the page's h1 tag to the end of the link.

So far, I can get it to show the URL itself, which doesn't require a page request, but not the page's h1 tag contents, which does.

I understand from other questions on this site that GM_xmlhttpRequest is asynchronous, and I am pretty sure this is at least part of the cause. However, I cannot find the solution to this specific problem.
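
To make the timing issue concrete, here is a minimal sketch that runs outside Greasemonkey. fakeRequest is a hypothetical stand-in for GM_xmlhttpRequest (it only assumes that onload is called later, never during the call itself), and the URLs are placeholders.

```javascript
// Hypothetical stand-in for GM_xmlhttpRequest: invokes onload later,
// after the currently running code has finished.
function fakeRequest(details) {
    setTimeout(function () {
        details.onload({ status: 200 });
    }, 0);
}

var urls = ['http://example.com/a', 'http://example.com/b'];
var seenInLoop = [];   // what the loop body sees right after each request
var pageTitle;

for (var i = 0; i < urls.length; i++) {
    fakeRequest({
        method: 'GET',
        url: urls[i],
        onload: function (response) {
            // runs only after the whole loop has finished
            pageTitle = response.status;
        }
    });
    // the request has been started, but no response has arrived yet
    seenInLoop.push(pageTitle);
}

console.log(seenInLoop); // [ undefined, undefined ]
```

Any code that needs the response has to run from inside onload, not on the line after the request.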

Below is the code I have so far. It is for Facebook's website comment moderation tool -- in the Moderator View, each comment has a link, "Visit Website," that takes you to the article the comment is on.

As it is written right now, it would append the HTTP status code, not the page title, and then the URL to each "Visit Website" link. The status code part is just a placeholder. I plan on adding the HTML parsing, etc. to get the h1 tag later.

Right now I am just trying to get the GM_xmlhttpRequest and the content insertion to match up.

Any help in sorting this out would be greatly appreciated. Thank you!

var allLinks, thisLink, expr, pageTitle, myURL, pageContent, title;

// define which links to process
expr = "//a[contains (string(), 'Visit Website')]";
allLinks = document.evaluate(
    expr,
    document,
    null,
    XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE,
    null);

// cycle through links
for (var i = 0; i < allLinks.snapshotLength; i++) {

    thisLink = allLinks.snapshotItem(i);
    myURL = thisLink.getAttribute('href');

    GM_xmlhttpRequest({
        method: 'GET',
        url: myURL,
        onload: function(responseDetails){

            pageTitle = responseDetails.status;

        }
    });

    // append info to end of each link
    title = document.createElement('div');
    title.style.backgroundColor = "yellow";
    title.style.color = "#000";
    title.appendChild(document.createTextNode(
        ' [' + pageTitle + ' - ' + thisLink.getAttribute('href') + ']'));
    thisLink.parentNode.insertBefore(title, thisLink.nextSibling);
}



  • As it's written, yes, you are running into the asynchronous nature of the GM_xmlhttpRequest() call. The loop fires off every request but continues immediately, without waiting for any of them to complete, so pageTitle is still undefined when you use it for the text node.

    The first step to rectify the situation is to move everything that currently follows the GM_xmlhttpRequest() call inside the onload: function() definition. Then each link is modified only after its page has been retrieved. (There may be other issues with needing to pass in or reacquire the thisLink value too, I'm not sure.)
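
    A rough sketch of that restructuring, written so it can be run outside Greasemonkey: annotateLinks, queuedRequest and the attach callback are hypothetical names for illustration, and plain objects with an href property stand in for the anchor elements. The one assumption about the request function is that it accepts the same details object as GM_xmlhttpRequest. Because each link is a function parameter, every onload callback keeps its own link, which also takes care of the thisLink concern.

```javascript
// request: a function with the same calling convention as GM_xmlhttpRequest
// attach:  hypothetical callback that puts the text next to the link
function annotateLinks(links, request, attach) {
    links.forEach(function (link) {
        // `link` is a separate binding for each iteration, so each
        // onload below sees the link its own request was started for
        request({
            method: 'GET',
            url: link.href,
            onload: function (response) {
                // everything that depends on the response lives in here
                attach(link, ' [' + response.status + ' - ' + link.href + ']');
            }
        });
    });
}

// Demo with stand-ins: requests are recorded, then answered out of order.
var pending = [];
function queuedRequest(details) { pending.push(details); }

var links = [{ href: 'http://example.com/a' }, { href: 'http://example.com/b' }];
var attached = [];
annotateLinks(links, queuedRequest, function (link, text) {
    attached.push(link.href + text);
});

pending[1].onload({ status: 404 });   // second page answers first
pending[0].onload({ status: 200 });

console.log(attached);
// [ 'http://example.com/b [404 - http://example.com/b]',
//   'http://example.com/a [200 - http://example.com/a]' ]
```

    In the userscript itself, request would be GM_xmlhttpRequest, the links would come from snapshotItem(i), and attach would create the div and call insertBefore as in the question's code.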