Retrieving a threaded comment list recursively


I am trying to write a recursive function that retrieves the nested comments from a Reddit submission. I am using Python and PRAW.

def _get_comments(comments, ret = []):
    for comment in comments:
        if len(comment._replies) > 0:
            return _get_comments(tail(comments), ret + [{
                #"body": comment.body,
                "id": comment.id,
                "author": str(comment.author),
                "replies": map(lambda replies: _get_comments(replies, []), [comment._replies])
                }])
        else:
            return ret + [{
                    #"body": comment.body,
                    "id": comment.id,
                    "author": str(comment.author)
                }]
    return ret

def tail(list):
    return list[1:len(list)]

And I get the following output, which is incomplete and has nested arrays:

pprint(_get_comments(s.comments))
[{'author': 'wheremydirigiblesat',
  'id': u'ctuzo4x',
  'replies': [[{'author': 'rhascal',
                'id': u'ctvd6jw',
                'replies': [[{'author': 'xeltius', 'id': u'ctvx1vq'}]]}]]},
 {'author': 'DemiDualism',
  'id': u'ctv54qs',
  'replies': [[{'author': 'rhascal',
                'id': u'ctv5pm1',
                'replies': [[{'author': 'blakeb43', 'id': u'ctvdb9c'}]]}]]},
 {'author': 'Final7C', 'id': u'ctvao9j'}]

The Submission object has a comments attribute which is a list of Comment objects. Each Comment object has a _replies attribute which is a list of more Comments.

What am I missing? I gave it my best shot -- recursion is hard.


Solution

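  • You almost have it. The problem is that you are making the recursion more complex than it needs to be. You don't need the tail() function or the map() call, since you are already iterating over comments. The output is incomplete because the return in the else branch exits at the first comment without replies and drops every sibling after it, and the nested arrays appear because map() runs over [comment._replies], a one-element list, so each result gets wrapped in an extra list.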

    I renamed your function in the examples, since what it actually does is convert comments to dicts.

    Let's start with the simple case. Think of it as: "OK, I want a function that converts a list of comments into a list of dicts." Just a simple function:

    def comments_to_dicts(comments):
        results = []  # create list for results
        for comment in comments:  # iterate over comments
            item = {
                "id": comment.id,
                "author": str(comment.author),
            }  # create dict from comment
    
            results.append(item)  # add converted item to results 
        return results  # return all converted comments
    

    Now you want each dict to also include the list of replies, converted to dicts. You already have a function that does this conversion, so just use it and put the result into item['replies']:

    def comments_to_dicts(comments):
        results = []  # create list for results
        for comment in comments:  # iterate over comments
            item = {
                "id": comment.id,
                "author": str(comment.author),
            }  # create dict from comment
    
            if len(comment._replies) > 0:
                item["replies"] = comments_to_dicts(comment._replies)  # convert replies using the same function
    
            results.append(item)  # add converted item to results 
        return results  # return all converted comments
    

    Because the function calls itself on each comment's replies, it converts all of them, no matter how deeply they are nested. I hope this makes it clearer how recursion works.
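To try the recursive version without a Reddit connection, here is a minimal sketch that runs it on stand-in objects. FakeComment is a hypothetical class invented for this example, not part of PRAW; it only mimics the id, author and _replies attributes used in the question. The thread data is made up as well.

```python
class FakeComment:
    """Hypothetical stand-in for a PRAW Comment (assumption: only the
    attributes used in the question are modelled)."""

    def __init__(self, id, author, replies=None):
        self.id = id
        self.author = author
        self._replies = replies or []


def comments_to_dicts(comments):
    results = []
    for comment in comments:
        item = {"id": comment.id, "author": str(comment.author)}
        if comment._replies:
            # Recurse: the replies are just another list of comments.
            item["replies"] = comments_to_dicts(comment._replies)
        results.append(item)
    return results


# Made-up thread: a comment without replies sits between two that have some.
thread = [
    FakeComment("a1", "alice", [
        FakeComment("b1", "bob", [FakeComment("c1", "carol")]),
    ]),
    FakeComment("a2", "dave"),
    FakeComment("a3", "erin", [FakeComment("b2", "frank")]),
]

tree = comments_to_dicts(thread)
```

Every sibling is kept, including the ones after a comment without replies, and each "replies" value is a flat list of dicts rather than a list wrapped in another list. With real PRAW data the reply lists may also contain MoreComments placeholders, which this sketch does not handle.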