I asked this question earlier [inactive and now deleted] but didn't phrase it well, so I'm trying to improve on that here.
Any help would be greatly appreciated.
WHAT I'M TRYING TO DO: automate certain tasks with Selenium (RESTful API)
WHAT I HAVE RUNNING:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException, TimeoutException

class SomeClass(SomeOtherClass):
    def do_tasks(self, tasks):
        booking_code = None
        task_done = 0
        driver = self.connect()  # spawns a Chrome browser
        # I want the below for-loop to run in parallel
        for task in tasks:
            try:
                # check_if_task_is_in_search_result_&_then_open_in_new_tab
                # do_something
                task_done += 1
                # close_tab
            except Exception:
                # handle_something: close the broken tab, switch back to the main window, refresh
                driver.close()
                driver.switch_to.window(driver.window_handles[0])
                driver.refresh()
        try:
            check = WebDriverWait(driver, 10).until(
                EC.element_to_be_clickable((By.CLASS_NAME, 'xxxx'))
            ).click()
        except (NoSuchElementException, TimeoutException) as e:
            log_error(str(e))
        else:
            booking_code = str(driver.find_element_by_class_name("number").text).split(':')[1]
        driver.quit()
        return task_done, booking_code
This runs sequentially and takes roughly 5 minutes for 5 tasks.
WHAT I'VE TRIED SO FAR: move the body of the for-loop into a new method, do_task.
from joblib import Parallel, delayed

class SomeClass(SomeOtherClass):
    def do_task(self, task):
        task_done = 0
        driver = self.connect()  # spawns a Chrome browser
        try:
            # do_something
            task_done += 1
        except Exception:
            # handle_something
            pass
        return task_done, driver

    def get_booking_code(self, driver):
        booking_code = None
        try:
            check = WebDriverWait(driver, 10).until(
                EC.element_to_be_clickable((By.CLASS_NAME, 'xxxx'))
            ).click()
        except NoSuchElementException as e:
            log_error(str(e))
        else:
            booking_code = str(driver.find_element_by_class_name("number").text).split(':')[1]
        driver.quit()
        return booking_code

if __name__ == '__main__':
    tasks = [
        ['task1'],
        ['task2']
    ]
    b = SomeClass(site='https://somesite.com/')  # Chrome connects to this via self.connect()
    completed_tasks, driver = Parallel(n_jobs=-1)(delayed(b.do_task)(task) for task in tasks)
    booking_code = b.get_booking_code(driver)
    print(completed_tasks, booking_code)
It doesn't run: it spawns a blank Chrome browser, which closes immediately.
Traceback as below:
completed_tasks, driver = Parallel(n_jobs=-1)(delayed(b.do_task)(task) for task in tasks)
File "--\env\lib\site-packages\joblib\parallel.py", line 1054, in __call__
self.retrieve()
File "--\env\lib\site-packages\joblib\parallel.py", line 933, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "--\env\lib\site-packages\joblib\_parallel_backends.py", line 542, in wrap_future_result
return future.result(timeout=timeout)
File "--\python\python38\lib\concurrent\futures\_base.py", line 439, in result
return self.__get_result()
File "c:\users\okwud\appdata\local\programs\python\python38\lib\concurrent\futures\_base.py", line 388, in __get_result
raise self._exception
selenium.common.exceptions.SessionNotCreatedException: Message: session not created
from disconnected: unable to connect to renderer
(Session info: chrome=91.0.4472.114)
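For context on how I now read the joblib call: Parallel returns a list with one entry per task (whatever each do_task call returned), so unpacking it into completed_tasks, driver only happens to line up with exactly two tasks, and each entry is a (task_done, driver) tuple rather than a driver. A toy sketch of the result shape, with plain functions and no Selenium (the task names are just placeholders):

from joblib import Parallel, delayed

def do_task(task):
    # stand-in for the real method: returns (task_done, some_value)
    return 1, f"result-for-{task}"

results = Parallel(n_jobs=-1)(delayed(do_task)(t) for t in ['task1', 'task2', 'task3'])
print(results)  # [(1, 'result-for-task1'), (1, 'result-for-task2'), (1, 'result-for-task3')]
# i.e. a list of per-task tuples, not a (completed_tasks, driver) pair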
I had this solved yesterday (tasks running in parallel), but I'm now facing an entirely new challenge, which I'll address in a separate post.
How I got the tasks to run in parallel (using my second code block above, 'what I've tried so far'):
import multiprocessing

# code excluded on purpose

if __name__ == '__main__':
    tasks = [
        ['task1'],
        ['task2']
    ]
    b = SomeClass()
    with multiprocessing.Pool(processes=2) as p:
        p.map(b.do_task, tasks)
The do_task method now only returns the number of tasks done (an arbitrary return value).
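In case it helps, a minimal sketch of how I collect those return values from Pool.map; it assumes do_task returns only an int (no driver or other unpicklable objects), and the sum() is just to illustrate aggregating the per-task counts:

import multiprocessing

if __name__ == '__main__':
    tasks = [
        ['task1'],
        ['task2']
    ]
    b = SomeClass()
    with multiprocessing.Pool(processes=2) as p:
        results = p.map(b.do_task, tasks)  # one return value per task, in input order
    completed_tasks = sum(results)         # works because do_task returns an int per task
    print(completed_tasks)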