python  selenium-chromedriver  python-multiprocessing  nameerror

Python, NameError: name 'browser' is not defined


I know there are tons of questions like this one, and I tried to read them all. I'm using the multiprocessing library to parse web pages with Python Selenium. I have 3 lists to pass to a function that processes them. First I define the function, then create the browser instance, and finally start the 3 processes.

import ...

def parsing_pages(list_with_pages_to_parse):
    global browser
    #do stuff

if __name__ == '__main__':
    browser = webdriver.Chrome(..., options = ...)
    browser.get(...)

    lists_with_pages_to_parse = [ [...], [...], [...] ]
    
    pool = mp.Pool(3)
    pool.map(parsing_pages, lists_with_pages_to_parse)
    pool.close()
    pool.join()

The error:

NameError: name 'browser' is not defined

Traceback (most recent call last):
  File "c:\Users\39338\Desktop\program.py", line 323, in <module>
    pool.map(parsing_pages, lists_with_pages_to_parse)
  File "C:\Users\39338\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\39338\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 771, in get
    raise self._value
NameError: name 'browser' is not defined

I used global to allow "browser" to be used inside the function. I thought the problem was that the function is defined before I create "browser", but when I move the function after the main part, I get an error that the function cannot be found when it is called.


Solution

  • When this module is loaded with __name__ != '__main__' (imported from another file, or re-imported by a worker process), the code under the if statement never runs, so browser is never initialized. Example:

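    # Script 1: run directly, the guard is true, so browser is assigned
    def f():
        global browser
        browser
    
    if __name__ == '__main__':
        browser = None
    
    # Calling f will not raise an error
    f()
    
    # Script 2 (a separate file): run directly, the guard is false, so browser is never assigned
    def f():
        global browser
        browser
    
    if __name__ != '__main__':
        browser = None
    
    # Calling f will now raise a NameError
    f()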
    

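    I think what's happening is that the pool runs parsing_pages() in worker processes where __name__ != '__main__'. On Windows, multiprocessing starts workers with the spawn method, which re-imports your file in each worker under a different module name, so the code under the if statement does not run there.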


    You need to do one of the following:

    • Pass browser into your function as an argument
    • Initialize browser outside of the if statement
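
One way to sketch the second option without opening a browser at import time is to create the object inside each worker through the pool's initializer argument. This is only an illustration under assumptions: FakeBrowser is a hypothetical stand-in for webdriver.Chrome(...), and init_worker and main are names made up for this example.

```python
import multiprocessing as mp


# Hypothetical stand-in for webdriver.Chrome(...); not part of Selenium.
class FakeBrowser:
    def get(self, url):
        return f"fetched {url}"


browser = None  # assigned separately inside every worker process


def init_worker():
    """Runs once in each worker, so the global exists where it is used."""
    global browser
    browser = FakeBrowser()


def parsing_pages(list_with_pages_to_parse):
    # Uses the worker's own module-level browser.
    return [browser.get(page) for page in list_with_pages_to_parse]


def main():
    lists_with_pages_to_parse = [["a", "b"], ["c"], ["d", "e"]]
    # initializer is called in each worker before it receives any task.
    with mp.Pool(3, initializer=init_worker) as pool:
        return pool.map(parsing_pages, lists_with_pages_to_parse)


if __name__ == '__main__':
    print(main())
```

With a real Selenium driver, each worker would presumably open its own browser window, and closing those browsers when the work is done is left out of this sketch.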

    You should add print(__name__) inside parsing_pages() to check what it equals. In the worker processes it will probably not be __main__ (with the spawn start method it is typically __mp_main__).


    Edit after problem was solved:

    __name__ equals '__main__' only in the process that runs the file directly as a script. It does not when the file is imported from another file, or when a multiprocessing worker re-imports it (as happens with the spawn start method used on Windows). As parsing_pages() was running in a multiprocessing pool, the workers did not satisfy __name__ == '__main__', so the conditional never allowed browser to be initialized in them.

    This is discussed in much more detail below:

    A video for easy digestion (in Python2 but that's fine)

    Python Tutorial: if __name__ == '__main__' (Youtube | 8:42)

    Most detailed articles (Stack Overflow)

    What does if __name__ == "__main__": do?

    Purpose of 'if __name__ == "__main__":'

    And if you're interested

    What's the point of a main function and/or __name__ == "__main__" check in Python?