I am downloading data from a server using urllib2, and I need to determine the IP address of the server I am actually connected to.
import urllib2

STD_HEADERS = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
               'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
               'Accept-Language': 'en-us,en;q=0.5',
               'User-Agent': 'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.12) '
                             'Gecko/20101028 Firefox/3.6.12'}

request = urllib2.Request(url, None, STD_HEADERS)  # url holds the target URL
data = urllib2.urlopen(request)
Please don't suggest resolving the IP address from the original URL: with HTTP redirects or a load-balancing server, there is no guarantee that the host that actually served the data and a separate DNS lookup of the URL resolve to the same IP address.
import urllib2, socket, urlparse

# set up your request as before, then:
data = urllib2.urlopen(request)
# geturl() reflects any redirects; resolve that hostname to an IP
addr = socket.gethostbyname(urlparse.urlparse(data.geturl()).hostname)
data.geturl() returns the URL that was actually used to retrieve the resource, after any redirects. The hostname is then fished out with urlparse and handed off to socket.gethostbyname to get the IP address.
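For instance, a quick sketch with a placeholder URL (whether and where it redirects depends entirely on the server):

import urllib2, socket, urlparse

data = urllib2.urlopen('http://github.com')  # placeholder URL
print data.geturl()                          # final URL, after any redirect
print socket.gethostbyname(urlparse.urlparse(data.geturl()).hostname)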
Some hostnames have more than one IP address, so it's still possible that the request was fulfilled by a different server, but this is as close as you're going to get. A gethostbyname right after the URL request will hit your DNS cache anyway, so unless you're dealing with a time-to-live of about a second, you'll resolve to the same server you just used.
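If you want to see every address a name resolves to, socket.getaddrinfo will list them all (a small sketch; 'www.example.com' is just a placeholder):

import socket

# gethostbyname returns a single address; getaddrinfo reports every
# address the resolver returns for the name.
for family, socktype, proto, canonname, sockaddr in socket.getaddrinfo(
        'www.example.com', 80, socket.AF_INET, socket.SOCK_STREAM):
    print sockaddr[0]    # sockaddr is (ip, port) for AF_INET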
If this is insufficient, you could spin off a thread and run lsof while still connected to the remote server. I'm sure you could convince urllib2 to leave the connection open long enough for that to succeed. This seems like rather more work than it's worth, though.
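A lighter-weight alternative to lsof, if you don't mind depending on CPython 2 internals rather than a public API, is to reach through the response object to the underlying socket and ask it directly. A sketch; the attribute chain below is an implementation detail of urllib2/httplib and may break between versions:

import urllib2

data = urllib2.urlopen(request)
# addinfourl -> socket._fileobject -> httplib.HTTPResponse -> socket
# (private attributes; ask before the response is fully read, since
# httplib closes the socket once the body is exhausted)
sock = data.fp._sock.fp._sock
print sock.getpeername()    # (ip, port) of the server we're actually talking to

Unlike the gethostbyname approach, this reports the peer of the live connection itself, so multiple A records and DNS round-robin can't mislead it.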