
Python Azure SDK how to get a previous version of a blob


I am currently using Python 3.8.8 with version 12.9.0 of azure.storage.blob and 1.14.0 of azure.core.

I want to use the Azure Python SDK to control blob versioning. Ideally I would like the following two operations for a given BlobClient.

  • Get a version of the blob by the specified versionId.
  • Set a version of this blob to be the 'current version' of the blob.

I have activated versioning in my Azure storage account. This is my setup so far.

from azure.storage.blob import ContainerClient

container_client = ContainerClient(
  my_account_name, 
  my_container_name, 
  credential = my_credentials
)

container_client.upload_blob(my_blob_name, dummy_data)

blob_client = container_client.get_blob_client(my_blob_name)

blob_properties = blob_client.get_blob_properties()

for key, value in blob_properties.__dict__.items():
  print(f'{key}: {value}')

Looking through the blob properties, I can see that version_id is a timestamp and is_current_version is True. I then used the following to upload a new version.

blob_client.upload_blob(edited_dummy_data, overwrite=True)

blob_properties = blob_client.get_blob_properties()

for key, value in blob_properties.__dict__.items():
  print(f'{key}: {value}')

After this, the version ID has changed and is_current_version is still True. In the Azure portal I can see there is a previous version. I can list these versions with the Python SDK using the following code.

blob_list = container_client.list_blobs(name_starts_with = my_blob_name, include = ['versions'])

for blob_property in blob_list:
  print(blob_property.name, blob_property.version_id)

However, when I try to access the different versions of the blob using the following, I only get the properties of the current version back.

blob_list = container_client.list_blobs(name_starts_with = my_blob_name, include = ['versions'])

for blob_property in blob_list:

  blob_client = container_client.get_blob_client(blob_property)
  blob_properties = blob_client.get_blob_properties()

  for key, value in blob_properties.__dict__.items():
    print(f'{key}: {value}')

A similar question was posted about this here, but not for Python. Moreover, I would like to revert a blob to a previous version if required (I know this can be done in the Azure portal using the guide here).

I have tried using the parameter versionId in the name of the blob like below.

blob_client = container_client.get_blob_client(my_blob_name + '?versionId=' + old_version_id)

Looking at the documentation here, there is no keyword for the version ID. I tried passing one anyway and, unsurprisingly, got an error.

Any help would be greatly appreciated. Simply knowing that this functionality is not available would also be useful.


Solution

  • The get_blob_properties method accepts a version_id parameter, so you need to provide it when fetching the properties of a versioned blob.

    Essentially you would need to change the following line of code:

    blob_properties = blob_client.get_blob_properties()
    

    to

    blob_properties = blob_client.get_blob_properties(version_id='blob version id')
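
Applied to the loop from the question, this might look like the minimal sketch below. The helper names properties_for_each_version and download_version are hypothetical and not part of the SDK. The sketch assumes a ContainerClient and BlobClient like the ones in the question, and that download_blob accepts the same version_id keyword as get_blob_properties (this should hold for azure.storage.blob 12.4.0 and later, but check it against the installed version).

```python
def properties_for_each_version(container_client, blob_name):
    """Return a dict mapping version_id -> properties for every version of blob_name.

    Hypothetical helper: container_client is expected to behave like
    azure.storage.blob.ContainerClient.
    """
    blob_client = container_client.get_blob_client(blob_name)
    result = {}
    # include=['versions'] lists previous versions alongside the current one.
    for listed in container_client.list_blobs(
        name_starts_with=blob_name, include=['versions']
    ):
        # name_starts_with is a prefix match, so skip other blobs sharing the prefix.
        if listed.name != blob_name:
            continue
        # Passing version_id makes the call target that version, not the current one.
        result[listed.version_id] = blob_client.get_blob_properties(
            version_id=listed.version_id
        )
    return result


def download_version(blob_client, version_id):
    """Return the content (bytes) of one specific version of the blob.

    Assumes download_blob accepts the version_id keyword.
    """
    return blob_client.download_blob(version_id=version_id).readall()
```

The version ID is passed on each call rather than when the client is created, which is probably why the loop in the question kept returning the current version: the BlobClient is built from the blob name only.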
    

    UPDATE

    To overwrite a base blob with one of its versions, use the start_copy_from_url method on the blob client and pass it the URL of the versioned blob. The URL of the versioned blob is the same as that of the base blob, but with versionId added as a query string parameter.
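
The restore step could be sketched as follows. The helper names versioned_blob_url and restore_version are hypothetical. The sketch assumes blob_client is a BlobClient for the base blob, that its url attribute holds the base blob URL, and that the client is authorized to read the source (otherwise a SAS token may be needed on the source URL).

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def versioned_blob_url(base_blob_url, version_id):
    """Return base_blob_url with versionId=<version_id> appended to its query string."""
    parts = urlsplit(base_blob_url)
    version_param = urlencode({'versionId': version_id})
    # Leave any existing query string (for example a SAS token) untouched.
    query = f'{parts.query}&{version_param}' if parts.query else version_param
    return urlunsplit(parts._replace(query=query))


def restore_version(blob_client, version_id):
    """Copy the given version over the base blob, making its content current again."""
    source_url = versioned_blob_url(blob_client.url, version_id)
    # The copy is started on the base blob, with the versioned blob as the source.
    return blob_client.start_copy_from_url(source_url)
```

For example, restore_version(blob_client, old_version_id) would start a copy of that version onto the base blob. With versioning enabled, this is expected to produce a new current version carrying the old content rather than change the version history.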