My case is the following:
What I have done so far:
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, BlobLeaseClient, BlobPrefix, ContentSettings
# Set the connection string for the source and destination storage accounts
SOURCE_CONNECTION_STRING = "your SOURCE connection string"
DESTINATION_CONNECTION_STRING = "your DESTINATION connection string"
# Create the BlobServiceClient objects for the source and destination storage accounts
source_blob_service_client = BlobServiceClient.from_connection_string(SOURCE_CONNECTION_STRING)
destination_blob_service_client = BlobServiceClient.from_connection_string(DESTINATION_CONNECTION_STRING)
# List all containers in the source storage account
source_containers = source_blob_service_client.list_containers()
# Iterate through each container in the source storage account
for source_container in source_containers:
    print(f"Processing container '{source_container.name}'...")
    # Create a new container in the destination storage account (if it doesn't exist already)
    destination_container = destination_blob_service_client.get_container_client(source_container.name)
    if not destination_container.exists():
        print(f"Creating container '{source_container.name}' in the destination storage account...")
        destination_container.create_container()
    # Get a list of all blobs in the current source container
    source_container_client = source_blob_service_client.get_container_client(source_container.name)
    source_blobs = source_container_client.list_blobs()
    #source_blobs = source_blob_service_client.list_blobs(source_container.name)
    # Iterate through each blob in the current source container
    for source_blob in source_blobs:
        # Check if the blob already exists in the destination container
        destination_blob = destination_blob_service_client.get_blob_client(source_container.name, source_blob.name)
        print(source_blob)
        if not destination_blob.exists() or source_blob.last_modified > destination_blob.get_blob_properties().last_modified:
            # Copy the blob to the destination container (with the same directory structure as in the source)
            #source_blob_client = BlobClient.from_blob_url(source_blob.url)
            source_blob_client = BlobClient.from_blob_url(source_blob.url)
            destination_blob.start_copy_from_url(source_url=source_blob.url)
            print(f"Copied blob '{source_blob.name}' to container '{source_container.name}' in the destination storage account.")
However, I get an error (AttributeError: 'BlobProperties' object has no attribute 'url'), although I see it being used in this sample https://github.com/Azure-Samples/AzureStorageSnippets/blob/master/blobs/howto/python/blob-devguide-py/blob-devguide-blobs.py and in the documentation at https://learn.microsoft.com/en-us/python/api/azure-storage-blob/azure.storage.blob.blobclient?view=azure-python#azure-storage-blob-blobclient-start-copy-from-url.
Can someone suggest what I am doing wrong? I opted for Python because of the iterative requirement (going down to the most granular level of each container), which did not seem doable in Synapse via pipeline activities.
I tried this in my environment and got the results below.
Initially, I got the same error in my environment: AttributeError: 'BlobProperties' object has no attribute 'url'.
The error occurs because source_blob, as yielded by list_blobs(), is of type BlobProperties, which doesn't have a url attribute. Instead, get a BlobClient for the source blob with get_blob_client() and use that client's url attribute as the source blob URL.
Code:
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, BlobLeaseClient, BlobPrefix, ContentSettings
# Set the connection string for the source and destination storage accounts
SOURCE_CONNECTION_STRING = "<src_connect_strng>"
DESTINATION_CONNECTION_STRING = "<dest_connect_strng>"
# Create the BlobServiceClient objects for the source and destination storage accounts
source_blob_service_client = BlobServiceClient.from_connection_string(SOURCE_CONNECTION_STRING)
destination_blob_service_client = BlobServiceClient.from_connection_string(DESTINATION_CONNECTION_STRING)
# List all containers in the source storage account
source_containers = source_blob_service_client.list_containers()
# Iterate through each container in the source storage account
for source_container in source_containers:
    print(f"Processing container '{source_container.name}'...")
    # Create a new container in the destination storage account (if it doesn't exist already)
    destination_container = destination_blob_service_client.get_container_client(source_container.name)
    if not destination_container.exists():
        print(f"Creating container '{source_container.name}' in the destination storage account...")
        destination_container.create_container()
    # Get a list of all blobs in the current source container
    source_container_client = source_blob_service_client.get_container_client(source_container.name)
    source_blobs = source_container_client.list_blobs()
    # Iterate through each blob in the current source container
    for source_blob in source_blobs:
        # Get a client for the same blob path in the destination container
        destination_blob = destination_blob_service_client.get_blob_client(source_container.name, source_blob.name)
        print(source_blob.name)
        # source_blob is a BlobProperties object (no url), so get a BlobClient for the source blob
        source_blob_client = source_blob_service_client.get_blob_client(source_container.name, source_blob.name)
        print(source_blob_client.url)
        destination_blob.start_copy_from_url(source_url=source_blob_client.url)
        print(f"Copied blob '{source_blob.name}' to container '{source_container.name}' in the destination storage account.")
Console:
The above code ran successfully in Synapse and copied the same structure from one storage account to the other.
Portal: In the portal, I can see that the destination account has the same structure as the source account.