I know there are tons of questions like this one, and I tried to read them all. I'm using the multiprocessing library to parse web pages with Python Selenium. I have 3 lists to pass to a function that processes them. First I define the function, then create the browser instance, and lastly start the 3 processes.

    import ...

    def parsing_pages(list_with_pages_to_parse):
        global browser
        # do stuff

    if __name__ == '__main__':
        browser = webdriver.Chrome(..., options = ...)
        browser.get(...)
        lists_with_pages_to_parse = [ [...], [...], [...] ]
        pool = mp.Pool(3)
        pool.map(parsing_pages, lists_with_pages_to_parse)
        pool.close()
        pool.join()

The error:
NameError: name 'browser' is not defined

    Traceback (most recent call last):
      File "c:\Users\39338\Desktop\program.py", line 323, in <module>
        pool.map(parsing_pages, lists_with_pages_to_parse)
      File "C:\Users\39338\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 364, in map
        return self._map_async(func, iterable, mapstar, chunksize).get()
      File "C:\Users\39338\AppData\Local\Programs\Python\Python310\lib\multiprocessing\pool.py", line 771, in get
        raise self._value
    NameError: name 'browser' is not defined

I used global to allow "browser" to be used inside the function. I thought the problem was that the function is written before I create "browser", but when I try to put it after the main part, I get the error that the function cannot be found when called.
Calling this function when `__name__ != '__main__'` (for example when the file is imported from another file or loaded by a worker process) will never initialize `browser`. Example:

    def f():
        global browser
        browser

    if __name__ == '__main__':
        browser = None

    # Calling f will not raise an error
    f()

Whereas if the condition is not met when the file is run directly, `browser` is never assigned:

    def f():
        global browser
        browser

    if __name__ != '__main__':
        browser = None

    # Calling f will now raise an error
    f()

I think what's happening is that you are making a pool, and the pool runs `parsing_pages()` from another process where `__name__ != '__main__'`.
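To make this concrete, here is a minimal sketch that simulates the effect without starting any processes. It executes the same source twice under different `__name__` values; `__mp_main__` is the name multiprocessing gives the re-imported script in spawned workers. The `SOURCE` string, the `run_as` helper and the "fake browser" value are hypothetical and exist only for illustration.

```python
# Module source with the same shape as the question's script:
# the global is only assigned inside the __main__ guard.
SOURCE = '''
def f():
    return browser

if __name__ == "__main__":
    browser = "fake browser"
'''


def run_as(module_name):
    """Execute SOURCE as if it were a module called module_name, then call f()."""
    namespace = {"__name__": module_name}
    exec(SOURCE, namespace)
    try:
        return namespace["f"]()
    except NameError as exc:
        return f"NameError: {exc}"


if __name__ == "__main__":
    print(run_as("__main__"))     # guard runs, so browser exists
    print(run_as("__mp_main__"))  # guard is skipped, so f() hits a NameError
```

Under `__main__` the guard runs and `f()` finds `browser`; under `__mp_main__` the guard is skipped, which is the same situation the pool workers end up in.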
You need to do one of the following:
- Pass `browser` into your function as an argument
- Initialize `browser` outside the `if` statement

You should add `print(__name__)` inside the function to check what it equals. In a pool worker on Windows it will typically be `__mp_main__` rather than `__main__`.
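Edit after problem was solved: `__name__` equals `'__main__'` only in the process that runs the file directly, i.e. when you run it by itself. It has a different value when the file is imported from another file or re-imported by a multiprocessing worker, which is what happens on Windows, where each pool worker starts a fresh interpreter. Because `parsing_pages()` was running in a multiprocessing pool, the workers did not satisfy `__name__ == '__main__'`, so the conditional never allowed `browser` to be initialized in them.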
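Since every worker is its own process, one way to act on this is to create the browser inside each worker instead of in the main guard. The sketch below uses the `initializer` argument of `multiprocessing.Pool` for that, which is a different technique from passing `browser` as an argument. `FakeBrowser`, `init_worker`, `main` and the sample page names are hypothetical stand-ins so the example runs without Selenium; in the real script the marked line would create the actual `webdriver.Chrome(...)` instance.

```python
import multiprocessing as mp

browser = None  # assigned separately in every worker process by init_worker()


class FakeBrowser:
    """Hypothetical stand-in for webdriver.Chrome so the sketch runs without Selenium."""

    def get(self, url):
        return f"<html>{url}</html>"


def init_worker():
    # Runs once in each worker process, so each worker owns its own browser.
    global browser
    browser = FakeBrowser()  # assumption: the real script would call webdriver.Chrome(...) here


def parsing_pages(list_with_pages_to_parse):
    # Reads the module-level global that init_worker() set in this process.
    return [len(browser.get(page)) for page in list_with_pages_to_parse]


def main():
    # Made-up page names, one inner list per worker.
    lists_with_pages_to_parse = [["a", "bb"], ["ccc"], ["dddd", "e"]]
    with mp.Pool(3, initializer=init_worker) as pool:
        return pool.map(parsing_pages, lists_with_pages_to_parse)


if __name__ == "__main__":
    print(main())
```

The pool is still created inside the `if __name__ == '__main__':` guard, which multiprocessing requires on Windows; only the browser setup moves into code that the workers actually execute.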
This is discussed in much more detail below:
- A video for easy digestion (in Python 2, but that's fine): Python Tutorial: if __name__ == '__main__' (YouTube | 8:42)
- Most detailed articles (Stack Overflow):
  - What does if __name__ == "__main__": do?
  - Purpose of 'if __name__ == "__main__":'
- And if you're interested: What's the point of a main function and/or __name__ == "__main__" check in Python?