How do I check whether my Python proxy rotator and user-agent spoofing work?


I have code for proxy IP rotation and user-agent spoofing that I want to use for scraping. Because the code was provided as an example, I don't know whether it really works once I add it to my own code.

I am a beginner in Python. I just added it to my .py file (after the scraping code). When I run the scraper, it works and gets all the data, but I can't tell whether the rotation and spoofing are actually taking effect.

  1. Do I have to create a separate file for this code (user-agent spoofing and IP rotation)?
  2. How can I tell whether they are working while I scrape?
  3. Does it matter that the examples define their own URLs?

Proxy Rotation:

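    import requests
    from itertools import cycle

    # Example proxies; public proxies go stale quickly, so expect some to fail.
    proxies = ['121.129.127.209:80', '124.41.215.238:45169', '185.93.3.123:8080', '194.182.64.67:3128', '106.0.38.174:8080', '163.172.175.210:3128', '13.92.196.150:8080']
    # get_proxies() is not defined in this snippet and would overwrite the list
    # above, so the call is disabled here.
    # proxies = get_proxies()
    proxy_pool = cycle(proxies)

    # httpbin.org/ip echoes back the IP address the request arrived from
    url = 'https://httpbin.org/ip'
    for i in range(1, 11):
        # Take the next proxy from the pool, wrapping around at the end
        proxy = next(proxy_pool)
        print("Request #%d" % i)
        try:
            response = requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10)
            print(response.json())
        except (requests.exceptions.RequestException, ValueError):
            # Dead proxy, timeout, or a non-JSON reply: move on to the next proxy
            print("Skipping. Connection error")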

User Agent Spoofing:

    import requests
    import random

    user_agent_list = [
        # Chrome
        'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36',
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36',
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
        'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36',
        # Internet Explorer (MSIE/Trident strings, labelled "Firefox" in the original example)
        'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)',
        'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)',
        'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)'
    ]
    # httpbin.org/user-agent echoes back the User-Agent header it received
    url = 'https://httpbin.org/user-agent'

    # Make 5 requests and see which user agents are used
    for i in range(1, 6):
        # Pick a random user agent
        user_agent = random.choice(user_agent_list)
        # Set the headers
        headers = {'User-Agent': user_agent}
        # Make the request
        response = requests.get(url, headers=headers, timeout=10)

        print("Request #%d\nUser-Agent Sent:%s\nUser-Agent Received by HTTPBin:" % (i, user_agent))
        print(response.json()['user-agent'])
        print("-------------------\n\n")

Solution

  • To check whether your proxy and user agent are actually rotating, go to a request bin website, activate an endpoint there, and use that endpoint's URL in your Python code in place of the URL you were requesting before.

    After running your code, open the request bin and look at the GET requests now listed. Each one shows the User-Agent header and the IP address it arrived with. If those values change from request to request, and the IP address is not your own, the rotation and spoofing are working.
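
    The same check can also be scripted instead of read off a web page. Below is a minimal sketch, not a definitive implementation. It assumes that https://httpbin.org/get (the service the question's examples already use) echoes the caller's IP under "origin" and the request headers under "headers". The helper names fetch_echo, check_rotation and summarize are made up for this illustration, as is the optional own_ip argument, which you would fill in with your real public IP.

```python
import random
from itertools import cycle


def fetch_echo(proxy, user_agent, timeout=10):
    """Send one request through `proxy` and return (ip_seen, user_agent_seen).

    Assumes httpbin.org/get echoes the caller's IP and headers.
    """
    import requests  # third-party; imported here so the helpers below run without it

    response = requests.get(
        "https://httpbin.org/get",
        proxies={"http": proxy, "https": proxy},
        headers={"User-Agent": user_agent},
        timeout=timeout,
    )
    data = response.json()
    return data["origin"], data["headers"]["User-Agent"]


def check_rotation(proxies, user_agents, attempts=5, fetch=fetch_echo):
    """Make `attempts` requests, rotating proxies and picking random user agents.

    `fetch` is injectable so the logic can be tried without network access.
    """
    proxy_pool = cycle(proxies)
    results = []
    for _ in range(attempts):
        proxy = next(proxy_pool)
        user_agent = random.choice(user_agents)
        try:
            seen_ip, seen_ua = fetch(proxy, user_agent)
        except Exception as exc:  # dead proxy, timeout, non-JSON reply, ...
            results.append({"proxy": proxy, "ok": False, "error": str(exc)})
            continue
        results.append({
            "proxy": proxy,
            "ok": True,
            "seen_ip": seen_ip,
            # True when the server saw exactly the user agent we sent
            "ua_matches": seen_ua == user_agent,
        })
    return results


def summarize(results, own_ip=None):
    """Condense the per-request results into a few yes/no style answers."""
    ok = [r for r in results if r["ok"]]
    return {
        "succeeded": len(ok),
        "failed": len(results) - len(ok),
        "distinct_ips": len({r["seen_ip"] for r in ok}),
        "all_user_agents_match": bool(ok) and all(r["ua_matches"] for r in ok),
        # Only meaningful when own_ip is given
        "leaked_own_ip": own_ip is not None and any(r["seen_ip"] == own_ip for r in ok),
    }
```

    To use it with the lists from the question, call summarize(check_rotation(proxies, user_agent_list)). Roughly speaking, more than one distinct IP, no leak of your own IP, and all_user_agents_match being true suggest that both parts are taking effect. Keep in mind that this only tests the echo endpoint: your scraper's own requests.get calls still need the same proxies= and headers= arguments, otherwise they go out unchanged.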