python azure-data-lake

Create file system/container if not found


I'm trying to export a CSV to Azure Data Lake Storage, but the code breaks when the file system/container does not exist. I have read through the documentation but could not find anything that covers this situation.

How do I go about creating a container in Azure Data Lake Storage if the container specified by the user does not exist?

Current Code:

    try:
        file_system_client = service_client.get_file_system_client(file_system="testfilesystem")
    except Exception:
        file_system_client = service_client.create_file_system(file_system="testfilesystem")

Traceback:

(FilesystemNotFound) The specified filesystem does not exist.
RequestId:XXXX
Time:2021-03-31T13:39:21.8860233Z

Solution

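  • The try/except pattern does not help here: get_file_system_client() only builds a client object and sends no request, so no error is raised at that point when the file system is missing. Use the built-in exists() method of file_system_client from the Azure Data Lake Gen2 library instead.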

    First, make sure you have version 12.3.0 of the azure-storage-file-datalake library installed (the latest at the time of writing). If you're not sure which version you have, run pip show azure-storage-file-datalake to check.

    Then you can use the code below:

    from azure.storage.filedatalake import DataLakeServiceClient
    
    # "xxx" stands in for the storage account name and for the credential.
    service_client = DataLakeServiceClient(
        account_url="https://xxx.dfs.core.windows.net", credential="xxx")
    
    # get_file_system_client() does not raise if the file system does not exist
    # (checked with library version 12.3.0).
    file_system_client = service_client.get_file_system_client("filesystem333")
    
    print("the file system exists: " + str(file_system_client.exists()))
    
    # Create the file system if it does not exist.
    if not file_system_client.exists():
        file_system_client.create_file_system()
        print("the file system is created.")
    
    # other code
    

    I've tested it locally and it works.
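If the check is needed in more than one place, the exists()/create_file_system() steps above could be wrapped in a small helper. The following is only a sketch: ensure_file_system is a hypothetical name, not part of the Azure SDK, and the two Fake* classes are in-memory stand-ins added here so the logic can be tried without a storage account. With a real account you would pass your DataLakeServiceClient instead.

```python
def ensure_file_system(service_client, name):
    """Return (file_system_client, created) for `name`, creating it if missing.

    `service_client` is assumed to behave like DataLakeServiceClient:
    get_file_system_client() returns an object with exists() and
    create_file_system() methods.
    """
    file_system_client = service_client.get_file_system_client(name)
    created = False
    if not file_system_client.exists():
        file_system_client.create_file_system()
        created = True
    return file_system_client, created


# --- Hypothetical in-memory stand-ins, for a local dry run only ---
class FakeFileSystemClient:
    def __init__(self, store, name):
        self._store = store  # set of existing names, shared with the service
        self._name = name

    def exists(self):
        return self._name in self._store

    def create_file_system(self):
        self._store.add(self._name)


class FakeServiceClient:
    def __init__(self):
        self._store = set()

    def get_file_system_client(self, file_system):
        return FakeFileSystemClient(self._store, file_system)


if __name__ == "__main__":
    service = FakeServiceClient()
    _, created = ensure_file_system(service, "testfilesystem")
    print("created on first call:", created)
    _, created = ensure_file_system(service, "testfilesystem")
    print("created on second call:", created)
```

One caveat with this approach: another process could create the file system between the exists() call and the create_file_system() call. If that matters in your setup, you may also want to handle the "already exists" error that the real client raises in that case.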