Python - list index out of range? Twitter Replies


I've been working on a script that scrapes the replies to a single tweet and writes them to a log.

I didn't write all of it myself. I finally got it almost working, but near the end it fails with an IndexError: "list index out of range".

I'm a little confused because I don't see what the problem is. Can somebody explain?

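import json
import logging
import sys
import time
import urllib.parse

import twitter  # python-twitter package

# NOTE: get_replies() below uses a global `t`, assumed here to be an
# authenticated twitter.Api instance created elsewhere (not shown).

def tweet_url(t):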
    return "https://twitter.com/%s/status/%s" % (t.user.screen_name, t.id)

def get_tweets(filename):
    for line in open(filename):
        yield twitter.Status.NewFromJsonDict(json.loads(line))

def get_replies(tweet):
    user = tweet.user.screen_name
    tweet_id = tweet.id
    max_id = None
    logging.info("looking for replies to: %s" % tweet_url(tweet))
    while True:
        q = urllib.parse.urlencode({"q": "to:%s" % user})
        try:
            replies = t.GetSearch(raw_query=q, since_id=tweet_id, max_id=max_id, count=100)
        except twitter.error.TwitterError as e:
            logging.error("caught twitter api error: %s", e)
            time.sleep(60)
            continue
        for reply in replies:
            logging.info("examining: %s" % tweet_url(reply))
            if reply.in_reply_to_status_id == tweet_id:
                logging.info("found reply: %s" % tweet_url(reply))
                yield reply
                # recursive magic to also get the replies to this reply
                for reply_to_reply in get_replies(reply):
                    yield reply_to_reply
            max_id = reply.id
        if len(replies) != 100:
            break

if __name__ == "__main__":
    logging.basicConfig(filename="replies.log", level=logging.INFO)
    tweets_file = sys.argv[1]
    for tweet in get_tweets(tweets_file):
        for reply in get_replies(tweet):
            print(reply.AsJsonString())

The problem is near the bottom: the list access sys.argv[1] raises the "index out of range" error, but I don't see why. Any ideas?


Solution

  • From the official Python docs for sys.argv:

    The list of command line arguments passed to a Python script. argv[0] is the script name (it is operating system dependent whether this is a full pathname or not).

    The part that matters here is the first sentence:

    The list of command line arguments passed to a Python script

    That means sys.argv is a list, and accessing a list index that does not exist raises an IndexError. When the script is run with no arguments, sys.argv contains only the script name at index 0, so sys.argv[1] fails. You need to call your script with the argument it needs, and that argument will then be available as sys.argv[1].

    For example:

    python file_name.py some_argument
    

    Here some_argument is available as sys.argv[1]. You can check whether an argument was passed either by catching the IndexError with try, or by checking len(sys.argv):

    try:
        args = sys.argv[1]
    except IndexError:
        print('No argument passed')
    

    Or:

    if len(sys.argv) > 1:
        args = sys.argv[1]
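
Putting the length check into the script itself could look roughly like the sketch below. It is only an illustration: first_argument and main are hypothetical helper names that do not appear in the original script, and the usage message wording is an assumption. The idea is to print a hint and return a non-zero status instead of letting the IndexError propagate.

```python
import sys


def first_argument(argv):
    """Return the first command-line argument, or None if none was given."""
    # argv[0] is the script name, so a real argument needs len(argv) > 1.
    if len(argv) > 1:
        return argv[1]
    return None


def main(argv):
    # Hypothetical entry point: takes the argument list explicitly so it
    # can be exercised without touching the real sys.argv.
    tweets_file = first_argument(argv)
    if tweets_file is None:
        # Show a usage hint instead of crashing with IndexError.
        print("usage: python %s TWEETS_FILE" % argv[0], file=sys.stderr)
        return 1
    print("reading tweets from %s" % tweets_file)
    return 0
```

In the real script this would be wired up with sys.exit(main(sys.argv)) under the if __name__ == "__main__": guard, with the loop over get_tweets(tweets_file) taking the place of the final print.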