Tags: python, beautifulsoup, httplib, getresponse

python http page without heads


I have to search for a specific URL across a large number of IPs. I've written a Python script that checks whether the port is open and then checks whether the URL exists using httplib, and it's working great! My problem is that I've been getting too many false positives, because some network devices return status 200 when asked for my page and send back a page with the 400 error in the body.

Here is my code:

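import httplib  # Python 2 module; renamed http.client in Python 3

def MyPage(self, ip):
    try:
        conn = httplib.HTTPConnection(ip)
        # HEAD returns only the status line and headers, never the body
        conn.request("HEAD", "/path/to/mypage.php")
        resp = conn.getresponse()
        return resp.status == 200
    except Exception:
        # Any connection or protocol error counts as "page not found"
        return False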

Solution

  • I solved my problem by checking the title tag in the body of the page:

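    import httplib  # Python 2 module; renamed http.client in Python 3
    from bs4 import BeautifulSoup  # assumes BeautifulSoup 4

    def MyPage(self, ip):
        try:
            conn = httplib.HTTPConnection(ip)
            # GET instead of HEAD, so the response includes the body
            conn.request("GET", "/path/to/mypage.php")
            resp = conn.getresponse()
            if resp.status != 200:
                return False
            html = BeautifulSoup(resp.read(), "html.parser")
            data = html.find('title')
            # A missing or empty <title> raises here and is caught below
            titulo = str(data.contents[0])
            return titulo == "THE TITLE"
        except Exception:
            return False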