azure, azure-blob-storage, azure-sdk-python

Copy files from one blob container to another container using Python


I am trying to copy specific files from one folder to another. When I use the wildcard operator (*) at the end of the prefix, the copy does not happen.

But if I provide just the folder name, all the files from the source folder are copied to the target folder without any issues.

Problem: The file copy does not happen when the wildcard operator is used. Can you please help me fix the problem?

def copy_blob_files(account_name, account_key, copy_from_container, copy_to_container, copy_from_prefix):
  try:
    blob_service = BlockBlobService(account_name=account_name, account_key=account_key)
    files = blob_service.list_blobs(copy_from_container, prefix=copy_from_prefix)

    for f in files:
      #print(f.name)
      blob_service.copy_blob(copy_to_container, f.name.replace(copy_from_prefix,""), f"https://{account_name}.blob.core.windows.net/{copy_from_container}/{f.name}")
  except:
    print('Could not copy files from source to target')

copy_from_prefix = 'Folder1/FileName_20191104*.csv'
copy_blob_files (accountName, accesskey, copy_fromcontainer, copy_to_container, copy_from_prefix)

Solution

  • The copy_blob method does not support wildcards.

    1. If you want to copy blobs that match a pattern, you can filter them in the list_blobs() method with the prefix parameter (which also does not support wildcards). In your case, the prefix should be copy_from_prefix = 'Folder1/FileName_20191104'; note that there is no wildcard. Keep in mind that this matches every blob whose name starts with that prefix, not only the .csv files.

    The code below works on my side: all the blobs matching the prefix are copied, and the prefix is removed from each blob name:

    from azure.storage.blob import BlockBlobService
    
    account_name ="xxx"
    account_key ="xxx"
    
    copy_from_container="test7"
    copy_to_container ="test4"
    
    #remove the wildcard
    copy_from_prefix = 'Folder1/FileName_20191104'
    
    def copy_blob_files(account_name, account_key, copy_from_container, copy_to_container, copy_from_prefix):
        try:
            block_blob_service = BlockBlobService(account_name,account_key)
            files = block_blob_service.list_blobs(copy_from_container,copy_from_prefix)
            for file in files:
                block_blob_service.copy_blob(copy_to_container,file.name.replace(copy_from_prefix,""),f"https://{account_name}.blob.core.windows.net/{copy_from_container}/{file.name}")
    
        except Exception as e:
            #print the actual error instead of hiding it
            print('could not copy files:', e)
    
    copy_blob_files(account_name,account_key,copy_from_container,copy_to_container,copy_from_prefix)
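If you also need the part after the wildcard (for example the .csv extension) to be honored, one option is to list with the literal part of the pattern as the prefix and then filter the returned names on the client side. The following is only a minimal sketch under that assumption: literal_prefix and matching_names are hypothetical helper names, not part of the Azure SDK, and the matching is done with Python's standard fnmatch module.

```python
from fnmatch import fnmatchcase


def literal_prefix(pattern):
    """Return the part of the pattern before the first wildcard character."""
    for i, ch in enumerate(pattern):
        if ch in "*?[":
            return pattern[:i]
    return pattern


def matching_names(blob_names, pattern):
    """Keep only the names that match the full wildcard pattern."""
    return [name for name in blob_names if fnmatchcase(name, pattern)]


# Usage sketch (not executed here; block_blob_service as in the code above):
#   pattern = 'Folder1/FileName_20191104*.csv'
#   blobs = block_blob_service.list_blobs(copy_from_container,
#                                         prefix=literal_prefix(pattern))
#   for name in matching_names([b.name for b in blobs], pattern):
#       ...call copy_blob for this name as before...
```

The prefix still does the server-side narrowing, so only the blobs under 'Folder1/FileName_20191104' are listed, and the fnmatch step drops anything that does not end in .csv.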
    

    2. Alternatively, as others mentioned, you can call azcopy from Python (you can use azcopy v10, which is just a single .exe file). For using wildcards in azcopy, you can follow this doc. First write your own azcopy command, then run it from your Python code as below:

    import subprocess
    
    #the path of azcopy.exe, v10 version
    exepath = "D:\\azcopy\\v10\\azcopy.exe"
    
    #the full azcopy command line, starting with the path in exepath
    myscript= "your azcopy command"
    
    #call the azcopy command
    subprocess.call(myscript)
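As a rough illustration of what such a command could look like, the sketch below builds the argument list for azcopy copy with a wildcard filter. It is an assumption-laden example, not a tested recipe: build_azcopy_command and run_azcopy are hypothetical helper names, the source and destination URLs (including any SAS tokens) are placeholders you have to supply, and the --include-pattern and --recursive flags should be checked against the azcopy documentation for your version.

```python
import subprocess


def build_azcopy_command(exepath, source_url, dest_url, pattern):
    """Build the argument list for an 'azcopy copy' call with a wildcard filter."""
    return [
        exepath,
        "copy",
        source_url,   # placeholder: source folder URL, with SAS token if needed
        dest_url,     # placeholder: target container URL, with SAS token if needed
        "--include-pattern", pattern,
        "--recursive",
    ]


def run_azcopy(command):
    """Run azcopy and return its exit code (0 means success)."""
    return subprocess.call(command)
```

Passing the command as a list instead of one string avoids quoting problems with the URLs, and checking the return value of run_azcopy tells you whether azcopy reported a failure.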