python beautifulsoup instagram instagram-api

How to web scrape Instagram profile links with BeautifulSoup?


I'm just starting to learn web scraping with BeautifulSoup and want to write a simple program that finds the Instagram profile links (URLs) of my idols from their full names.

Example: I have a list of full names stored in the file fullname.txt, as follows:

#cat fullname.txt
Cristiano Ronaldo
David Beckham
Michael Jackson

My desired result is:

https://www.instagram.com/cristiano/
https://www.instagram.com/davidbeckham/
https://www.instagram.com/michaeljackson/

Can you give me some suggestions?


Solution

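  • This worked for all 3 names, and for a few others I added to fullname.txt.

    It uses the Requests library to run a Bing search for each name plus the word "instagram", then uses regular expressions to pull the first result link out of the returned HTML. Note that it does not use BeautifulSoup, and that it depends on the markup of Bing's results page, so it may stop working if that markup changes.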

    
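    import re
    import requests
    
    def bingsearch(searchfor):
    
        # Base URL only; requests builds and URL-encodes the query string from params.
        link = 'https://www.bing.com/search'
    
        # Browser-like User-Agent, so Bing returns its regular results page.
        ua = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36'}
    
        payload = {'q': searchfor}
    
        response = requests.get(link, headers=ua, params=payload, timeout=10)
    
        try:
            # Text between the "Search Results" heading and the first closing </a> ...
            found = re.search(r'Search Results(.+?)</a>', response.text).group(1)
    
            # ... then the first href inside that text.
            iglink = re.search(r'a href="(.+?)"', found).group(1)
    
        except AttributeError:
            # re.search returned None: the expected markup was not in the page.
            iglink = "link not found"
    
        return iglink
    
    
    with open("fullname.txt", "r") as f:
        names = f.readlines()
    
    for name in names:
        name = name.strip()
    
        # Skip blank lines in the input file.
        if not name:
            continue
    
        # Plain spaces are fine here; requests encodes them in the query string.
        searchterm = name + " instagram"
    
        IGLink = bingsearch(searchterm)
    
        print(IGLink)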
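
The regular expressions in the answer rely on the literal text "Search Results" appearing in Bing's page. As a rough sketch of a less markup-dependent variant, the helper below scans any HTML string for the first URL that looks like an Instagram profile. The function name `extract_instagram_link`, the pattern, and the list of non-profile path segments are all assumptions made for illustration; they are not part of the original answer and are not an exhaustive description of Instagram's URL scheme.

```python
import re

# Assumed pattern: an instagram.com URL followed by one path segment made of
# letters, digits, dots, and underscores.
PROFILE_RE = re.compile(r'https?://(?:www\.)?instagram\.com/([A-Za-z0-9._]+)/?')

# Assumed (incomplete) set of first path segments that are not user profiles.
NON_PROFILE = {"p", "explore", "reel", "reels", "stories", "accounts", "tv"}


def extract_instagram_link(html):
    """Return the first profile-looking Instagram URL in html, or None."""
    for match in PROFILE_RE.finditer(html):
        username = match.group(1)
        if username.lower() not in NON_PROFILE:
            # Normalize to the https://www.instagram.com/<username>/ form.
            return "https://www.instagram.com/{}/".format(username)
    return None


if __name__ == "__main__":
    sample = (
        '<a href="https://www.instagram.com/p/abc123/">a post</a> '
        '<a href="https://www.instagram.com/cristiano/">Cristiano Ronaldo</a>'
    )
    print(extract_instagram_link(sample))
```

Under these assumptions, `bingsearch` could return `extract_instagram_link(response.text)` and fall back to "link not found" when the result is `None`. The first matching link is not guaranteed to be the official account, so results are worth checking by hand.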