I've been working on a script to scrape the replies to a single tweet into a log.
I haven't worked on this all by myself, and I finally got it almost working, but I hit an index error near the end: "list index out of range".
I'm a little confused because I don't see what the problem is here... can somebody explain? ._.
import json
import logging
import sys
import time
import urllib.parse

import twitter  # the python-twitter package

# NOTE: get_replies() relies on a global `t`, which must be an authenticated
# twitter.Api instance; its setup is not shown in the question.


def tweet_url(t):
    return "https://twitter.com/%s/status/%s" % (t.user.screen_name, t.id)


def get_tweets(filename):
    # One JSON-encoded tweet per line.
    with open(filename) as f:
        for line in f:
            yield twitter.Status.NewFromJsonDict(json.loads(line))


def get_replies(tweet):
    user = tweet.user.screen_name
    tweet_id = tweet.id
    max_id = None
    logging.info("looking for replies to: %s" % tweet_url(tweet))
    while True:
        q = urllib.parse.urlencode({"q": "to:%s" % user})
        try:
            replies = t.GetSearch(raw_query=q, since_id=tweet_id, max_id=max_id, count=100)
        except twitter.error.TwitterError as e:
            logging.error("caught twitter api error: %s", e)
            time.sleep(60)
            continue
        for reply in replies:
            logging.info("examining: %s" % tweet_url(reply))
            if reply.in_reply_to_status_id == tweet_id:
                logging.info("found reply: %s" % tweet_url(reply))
                yield reply
                # recursive magic to also get the replies to this reply
                for reply_to_reply in get_replies(reply):
                    yield reply_to_reply
            max_id = reply.id
        if len(replies) != 100:
            break


if __name__ == "__main__":
    logging.basicConfig(filename="replies.log", level=logging.INFO)
    tweets_file = sys.argv[1]
    for tweet in get_tweets(tweets_file):
        for reply in get_replies(tweet):
            print(reply.AsJsonString())
So... it's in the bottom part: the list access (sys.argv[1]) is causing the problem, but I don't see why the out-of-range index error appears. Any idea?
From the official Python docs -
The list of command line arguments passed to a Python script. argv[0] is the script name (it is operating system dependent whether this is a full pathname or not).
If I were to read this, I would read up to this point -
The list of command line arguments passed to a Python script
That means sys.argv is a list, and when you try to access an index that does not exist in a list, you get an IndexError. Your script has to be called with the arguments it needs; the first of them is then available as sys.argv[1].
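To make that concrete, here is a minimal sketch that uses a hand-written list as a stand-in for sys.argv. The name file_name.py is only a placeholder; the point is that a script started with no arguments has a one-element list, so index 1 does not exist.

```python
# A hand-written stand-in for sys.argv when the script is run with no arguments.
argv = ["file_name.py"]

print(len(argv))  # 1: only the script name is present

try:
    first = argv[1]  # there is no element at index 1
except IndexError as error:
    message = str(error)
    print("IndexError:", message)  # IndexError: list index out of range
```

Running the script with an argument is what fills that second slot.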
For example -
python file_name.py some_argument
And some_argument will be accessible from sys.argv[1]. You could test whether arguments have been passed to the script by using try, or by using len on argv, like -
try:
    args = sys.argv[1]
except IndexError:
    print('No argument passed')
Or -
if len(sys.argv) > 1:
    args = sys.argv[1]
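Applied to the script in the question, the len check could be wrapped roughly like this. This is only a sketch: first_argument and main are hypothetical helpers, and replies.py and tweets.json are made-up names, since the question does not say what the script or the input file is called. It passes simulated argument lists instead of reading sys.argv, so it behaves the same however it is run.

```python
import sys


def first_argument(argv):
    """Return argv[1] if the script was given an argument, else None."""
    # argv[0] is the script name, so a real argument needs len(argv) > 1.
    if len(argv) > 1:
        return argv[1]
    return None


def main(argv):
    tweets_file = first_argument(argv)
    if tweets_file is None:
        # Explain what is missing instead of crashing with an IndexError.
        print("usage: python replies.py TWEETS_FILE", file=sys.stderr)
        return 1  # non-zero status signals an error
    print("would read tweets from: %s" % tweets_file)
    return 0


# Simulated command lines (hypothetical script and file names).
print(main(["replies.py"]))                 # no argument given
print(main(["replies.py", "tweets.json"]))  # argument given
```

In the real script you would call main(sys.argv) and hand the returned status to sys.exit.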