
Trouble understanding the difference between passing a result to another function and returning a result to another function


I've written a script in Python that uses two functions. The first function is supposed to get some links from a webpage, and the other should print them to the console.

My question is: what difference does it make when I pass the result from one function to another using the return keyword, as in return get_info(elem)? Usually I can pass things from one function to another with just get_info(elem), so when should I choose return get_info(elem), and why?

An example might be:

import requests
from bs4 import BeautifulSoup

def get_links(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text,"lxml")
    elem = soup.select_one(".info h2 a[data-analytics]").get("href")
    get_info(elem)  #why this one
    return get_info(elem) #or why this

def get_info(link):
    print(link)

Solution

  • Let us first simplify your function so that you can run it and compare the results:

    def get_links(url):
        url = "this returns link: {}".format(url)
        get_info(url)  #why this one
        return get_info(url) #or why this
    
    def get_info(link):
        print(link)
    
    get_links('google.com')
    >>this returns link: google.com
    >>this returns link: google.com
    

    Your function now prints twice: once for the plain call to get_info, and once for the call inside the return statement. In this case get_links itself returns None, because get_info does not return anything.

    This is evident here:

    url = get_links('google.com')
    >>this returns link: google.com
    >>this returns link: google.com
    
    url
    >> *nothing happens* 
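
The bare name shows nothing because the interactive prompt does not echo None. A minimal sketch that makes the value visible, reusing the simplified functions above (the name result is illustrative, not from the original):

```python
def get_info(link):
    print(link)  # prints the link, but has no return statement


def get_links(url):
    url = "this returns link: {}".format(url)
    return get_info(url)  # hands back whatever get_info returns


result = get_links("google.com")
print(result is None)  # get_info returned nothing, so result is None
```

Here the return in get_links only forwards None, since get_info never returns a value of its own.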
    

    The effect of return is more apparent when get_info actually returns a value, for example:

    def get_links(url):
        url = "this returns link: {}".format(url)
        return get_info(url)
    
    def get_info(link):
        return "get_info does something, {}".format(link)
    
    url = get_links('google.com')
    url
    
    >>'get_info does something, this returns link: google.com'
    

    If you do not use return, the function simply returns None, which is fine when you only want to print the result, as you did. You can check this by assigning the call of a function that has no return to a name, as I did above: the value will be None.
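
As a rough rule of thumb, use return get_info(elem) when the caller needs the value afterwards, and a plain get_info(elem) when only the side effect (such as printing) matters. A small sketch of the two variants side by side; the function names get_links_without_return and get_links_with_return are hypothetical and used only for comparison:

```python
def get_info(link):
    # Return the value instead of printing it, so callers can reuse it.
    return "link: {}".format(link)


def get_links_without_return(url):
    get_info(url)  # the result is computed and then discarded


def get_links_with_return(url):
    return get_info(url)  # the result is handed back to the caller


print(get_links_without_return("google.com"))  # None
print(get_links_with_return("google.com"))     # link: google.com
```

Only the second variant lets later code store, filter, or pass on the link.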