I have a container that is built to run selenium-chromedriver with python to download an excel(.xlsx) file from a website.
I am Using SAM to build & deploy this image to be run in AWS Lambda.
When I build the container and invoke it locally, the program executes as expected: The download occurs and I can see the file placed in the root directory of the container.
The problem is: when I deploy this image to AWS and invoke my lambda function I get no errors, however, my download is never executed. The file never appears in my root directory.
My first thought was that maybe I didn't allocate enough memory to the lambda instance. I gave it 512 MB, and the logs said it was using 416MB. Maybe there wasn't enough room to fit another file inside? So I have increased the memory provided to 1024 MB, but still no luck.
My next thought was that maybe the download was just taking a long time, so I also allowed the program to wait for 5 minutes after clicking the download to ensure that the download is given time to complete. Still no luck.
I have also tried setting the following options for chromedriver (full list of chromedriver options posted at bottom):
options.add_argument(f"--user-data-dir={'/tmp'}"),
options.add_argument(f"--data-path={'/tmp'}"),
options.add_argument(f"--disk-cache-dir={'/tmp'}")
and also setting tempfolder = mkdtemp()
and passing that into the chrome options as above in place of /tmp
. Still no luck.
Since this applicaton is in a container, it should run the same locally as it does on AWS. So I am wondering if it is part of the config outside of the container that is blocking my ability to download a file? Maybe the request is going out but the response is not being allowed back in?
Please let me know if there is anything I need to clarify -- Any help on this issue is greatly appreciated!
Full list of Chromedriver options
options.binary_location = '/opt/chrome/chrome'
options.headless = True
options.add_argument('--disable-extensions')
options.add_argument('--no-first-run')
options.add_argument('--ignore-certificate-errors')
options.add_argument('--disable-client-side-phishing-detection')
options.add_argument('--allow-running-insecure-content')
options.add_argument('--disable-web-security')
options.add_argument('--lang=' + random.choice(language_list))
options.add_argument('--user-agent=' + fake_user_agent.user_agent())
options.add_argument('--no-sandbox')
options.add_argument("--window-size=1920x1080")
options.add_argument("--single-process")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--disable-dev-tools")
options.add_argument("--no-zygote")
options.add_argument(f"--user-data-dir={'/tmp'}")
options.add_argument(f"--data-path={'/tmp'}")
options.add_argument(f"--disk-cache-dir={'/tmp'}")
options.add_argument("--remote-debugging-port=9222")
options.add_argument("start-maximized")
options.add_argument("enable-automation")
options.add_argument("--headless")
options.add_argument("--disable-browser-side-navigation")
options.add_argument("--disable-gpu")
driver = webdriver.Chrome("/opt/chromedriver", options=options)```
Just in case anybody stumbles across this queston in future, adding the following to chrome options solved my issue:
prefs = {
"profile.default_content_settings.popups": 0,
"download.default_directory": r"/tmp",
"directory_upgrade": True
}
options.add_experimental_option("prefs", prefs)