Tags: python, amazon-web-services, aws-lambda, kaggle

Cannot download file to AWS Lambda


I have an AWS Lambda function that downloads a file. I have read that the only directory I can write to is /tmp, but I am still getting this error:

[ERROR] OSError: [Errno 30] Read-only file system: '/home/sbx_user1051'
Traceback (most recent call last):
  File "/var/lang/lib/python3.8/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/var/lang/lib/python3.8/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 702, in _load
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/var/task/lambda_function.py", line 9, in <module>
    from kaggle.api.kaggle_api_extended import KaggleApi
  File "/var/task/kaggle/__init__.py", line 19, in <module>
    from kaggle.api.kaggle_api_extended import KaggleApi
  File "/var/task/kaggle/api/__init__.py", line 22, in <module>
    from kaggle.api.kaggle_api_extended import KaggleApi
  File "/var/task/kaggle/api/kaggle_api_extended.py", line 84, in <module>
    class KaggleApi(KaggleApi):
  File "/var/task/kaggle/api/kaggle_api_extended.py", line 102, in KaggleApi
    os.makedirs(config_dir)
  File "/var/lang/lib/python3.8/os.py", line 213, in makedirs
    makedirs(head, exist_ok=exist_ok)
  File "/var/lang/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)

This is the code producing the error:

from kaggle.api.kaggle_api_extended import KaggleApi

def lambda_handler(event, context):
    api = KaggleApi()
    api.authenticate()

    api.dataset_download_file(
        "gpreda/covid-world-vaccination-progress",
        "country_vaccinations.csv",
        "/tmp",
    )
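    # Report success; the original returned an undefined name (bucket) with status 400.
    return {"statusCode": 200, "body": "country_vaccinations.csv downloaded to /tmp"}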

Solution

  • As pointed out by @joran, importing the package tries to create a config directory at import time:

        config_dir = os.environ.get('KAGGLE_CONFIG_DIR') or os.path.join(
            expanduser('~'), '.kaggle')
        if not os.path.exists(config_dir):
            os.makedirs(config_dir)
    

    You can set KAGGLE_CONFIG_DIR using AWS Lambda environment variables. In this case, point it directly at /tmp, because that is the only directory you can write to. (The snippet above is the corresponding code in kaggle/api/kaggle_api_extended.py, the file named in the traceback.)
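
    As a minimal sketch of the /tmp approach, assuming you prefer to set the variable in code rather than in the Lambda console: the helper names below are hypothetical, and resolve_config_dir only mirrors the lookup quoted above so the effect can be checked without the kaggle package. The variable has to be set before the import, because the lookup runs at import time.

```python
import os


def use_writable_kaggle_config_dir(path="/tmp"):
    """Point KAGGLE_CONFIG_DIR at a writable directory (on Lambda, /tmp)."""
    os.makedirs(path, exist_ok=True)
    os.environ["KAGGLE_CONFIG_DIR"] = path
    return path


def resolve_config_dir():
    """Mirror of the lookup in the quoted snippet, for illustration only."""
    return os.environ.get("KAGGLE_CONFIG_DIR") or os.path.join(
        os.path.expanduser("~"), ".kaggle"
    )


# Set the variable first ...
use_writable_kaggle_config_dir()

# ... and only then import the API (left commented out so the sketch runs
# without the kaggle package installed):
# from kaggle.api.kaggle_api_extended import KaggleApi
```

    Setting the same variable in the function's configuration avoids the code change entirely; the in-code version is only useful if you want the handler file to be self-explanatory.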

    From the documentation, it seems you only need the API credentials, which you can store in AWS Systems Manager Parameter Store and fetch inside your Lambda function.

    Kaggle API Credentials

    export KAGGLE_USERNAME=datadinosaur
    export KAGGLE_KEY=xxxxxxxxxxxxxx
    

    Once you have fetched and exported the credentials, you can add the import statement for the API after that point.
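
    A rough sketch of that fetch-then-export step, under stated assumptions: the parameter names /kaggle/username and /kaggle/key are hypothetical, and the lookup is passed in as a function so the export logic can be tried without AWS access. ssm_get_secret uses boto3's get_parameter call and needs an execution role that is allowed to read those parameters.

```python
import os


def export_kaggle_credentials(get_secret):
    """Export Kaggle credentials as environment variables.

    get_secret is any callable that maps a parameter name to its string
    value. The parameter names below are hypothetical placeholders.
    """
    os.environ["KAGGLE_USERNAME"] = get_secret("/kaggle/username")
    os.environ["KAGGLE_KEY"] = get_secret("/kaggle/key")


def ssm_get_secret(name):
    """Read one value from Parameter Store, decrypting SecureString values."""
    import boto3  # imported lazily so the sketch loads without boto3

    ssm = boto3.client("ssm")
    response = ssm.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]


# Inside the Lambda module, before importing the Kaggle API:
# export_kaggle_credentials(ssm_get_secret)
# from kaggle.api.kaggle_api_extended import KaggleApi
```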

    Or, if you are adventurous enough, you can modify the code a bit: create a Configuration class object and use it in the initialization.

            self.username = ""
            # Password for HTTP basic authentication
            self.password = ""
    
    

    Configuration Class