I am using the following code but I am not able to extract any information from the url.
from urllib.parse import urlparse
if __name__ == "__main__":
z = 5
url = 'https://twitter.com/isro/status/1170331318132957184'
df = urlparse(url)
print(df)
ParseResult(scheme='https', netloc='twitter.com', path='/isro/status/1170331318132957184', params='', query='', fragment='')
I was hoping to extract the tweet message, time of tweet and other information available from the link but the code above clearly doesn't achieve that. How do I go about it from here ?
print(df)
ParseResult(scheme='https', netloc='twitter.com', path='/isro/status/1170331318132957184', params='', query='', fragment='')
I think you may be misunderstanding the purpose of the urllib parseurl function. From the Python documentation:
urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)
Parse a URL into six components, returning a 6-item named tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment
From the result you are seeing in ParseResult, your code is working perfectly - it is breaking your URL up into the component parts.
It sounds as though you actually want to fetch the web content at that URL. In that case, I might take a look at urllib.request.urlopen instead.