
How to get all the comments from Disqus?


I want to fetch all the comments on a CNN article. CNN uses Disqus as its comment system. For example: http://edition.cnn.com/2013/02/25/tech/innovation/google-glass-privacy-andrew-keen/index.html?hpt=hp_c1

The comment system requires clicking "load more" to see additional comments. I tried parsing the HTML with PHP, but that could not retrieve all the comments because they are loaded with JavaScript. So I am wondering if anyone has a more convenient way to retrieve all the comments for a specific CNN URL.

Has anyone done this successfully? Thanks in advance.


Solution

  • The Disqus API paginates results with cursors, which are returned in each JSON response. See here for information about cursors: http://disqus.com/api/docs/cursors/

    Since you mentioned PHP, something like this should get you started:

    <?php
    $apikey = '<your key here>'; // get keys at http://disqus.com/api/ — can be public or secret for this endpoint
    $shortname = '<the disqus forum shortname>'; // defined in the var disqus_shortname = '...';
    $thread = 'link:<URL of thread>'; // IMPORTANT the URL that you're viewing isn't necessarily the one stored with the thread of comments
    //$thread = 'ident:<identifier of thread>'; Use this if 'link:' has no results. Defined in 'var disqus_identifier = '...';
    $limit = '100'; // max is 100 for this endpoint. 25 is default
    
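    $cursor = ''; // empty for the first request; later requests use the cursor value returned by the API

    // threads/listPosts needs the thread parameter; the cursor value itself is appended per request in listcomments()
    $endpoint = 'https://disqus.com/api/3.0/threads/listPosts.json?api_key='.urlencode($apikey).'&forum='.urlencode($shortname).'&thread='.urlencode($thread).'&limit='.$limit.'&cursor=';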
    
    $j=0;
    listcomments($endpoint,$cursor,$j);
    
    function listcomments($endpoint,$cursor,$j) {
    
        // Standard CURL
        $session = curl_init($endpoint.$cursor);
        curl_setopt($session, CURLOPT_RETURNTRANSFER, 1); // instead of just returning true on success, return the result on success
        $data = curl_exec($session);
        curl_close($session);
    
        // Decode JSON data
        $results = json_decode($data);
        if ($results === NULL) die('Error parsing json');
    
        // Comment response
        $comments = $results->response;
    
        // Cursor for pagination
        $cursor = $results->cursor;
    
        $i=0;
        foreach ($comments as $comment) {
            $name = $comment->author->name;
            $message = $comment->message; // separate variable so $comment is not overwritten before createdAt is read
            $created = $comment->createdAt;
            // Get more data...
    
            echo "<p>".$name." wrote:<br/>";
            echo $message."<br/>";
            echo $created."</p>";
            $i++;
        }
    
        // keep paging while a full page of 100 comes back, which suggests more comments remain
        if ($i == 100) {
            $cursor = $cursor->next;
            $i = 0;
            listcomments($endpoint,$cursor,$j);
            /* to cap the run at 10 pages, replace the call above with:
            $j++;
            if ($j < 10) {
                listcomments($endpoint,$cursor,$j);
            }*/
        }
    }
    
    ?>
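
The same cursor walk can also be written as a loop instead of recursion. The following is only a minimal sketch: `collect_all_posts` and the `$fetchPage` callable are hypothetical names, not part of the Disqus API, and the sketch assumes each decoded response carries a `response` array plus a `cursor` object with `hasNext` and `next` fields, as described on the cursors page linked above. The fetch step is passed in as a callable so the paging logic can be tried without a network call; here it is backed by canned pages.

```php
<?php
// Hypothetical helper: follows cursor->next while cursor->hasNext is true,
// collecting every post. $maxPages is a safety cap against endless paging.
function collect_all_posts(callable $fetchPage, int $maxPages = 50): array
{
    $posts = [];
    $cursor = ''; // empty cursor requests the first page

    for ($page = 0; $page < $maxPages; $page++) {
        $result = $fetchPage($cursor);

        // Stop if the page is missing or not shaped as expected
        if (!isset($result->response) || !is_array($result->response)) {
            break;
        }

        foreach ($result->response as $post) {
            $posts[] = $post;
        }

        // Stop when the API reports no further page
        if (empty($result->cursor->hasNext)) {
            break;
        }
        $cursor = $result->cursor->next;
    }

    return $posts;
}

// Canned pages standing in for decoded API responses (made-up data)
$fakePages = [
    '' => (object) [
        'response' => [(object) ['message' => 'first'], (object) ['message' => 'second']],
        'cursor'   => (object) ['hasNext' => true, 'next' => 'c1'],
    ],
    'c1' => (object) [
        'response' => [(object) ['message' => 'third']],
        'cursor'   => (object) ['hasNext' => false, 'next' => 'c2'],
    ],
];

$fetch = function (string $cursor) use ($fakePages) {
    return $fakePages[$cursor] ?? null;
};

$all = collect_all_posts($fetch);
echo count($all), "\n"; // prints 3
```

To use it against the real API, `$fetchPage` would wrap the curl request and `json_decode` call from the answer above, appending the cursor value to the endpoint URL. Checking `hasNext` avoids relying on the page size to decide whether more comments remain.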