I have the code for multiple microservices uploaded to Google Cloud Storage as zip files, in different buckets such as livewinpkgs, uatwinpkgs, devwinpkgs, etc. Inside each bucket there is one folder per microservice, such as customerservice, loginservice, checkoutservice, etc.
Inside each service folder, the files are named by version: customerservice.1.0.1.zip, customerservice.1.0.2.zip, etc.
So the tree structure is something like this:
livewinpkgs ---> customerservice ---> customerservice.1.0.1.zip, customerservice.1.0.2.zip, etc., which goes on for a long list of versions.
Now, my requirement is to set up a policy that deletes all the versions in each service folder except for the last 2 or 3. I know this can be achieved with object versioning, which works flawlessly when the file name stays the same and each upload just overwrites the current version with a new one. But in my case the name changes with every version (1.0.1, 1.0.2, 1.0.3), with the service name prepended to it.
Another option was to set up a delete lifecycle rule using the Age condition, but some services go unchanged for many months, so if I set up a policy to delete files older than 90 days, there is a chance that I lose the current live version as well.
Just sharing a reference image for better understanding. As seen in the image, this is what my files look like in the bucket folder, except for the last one (version4.txt), which has multiple versions of it. Say I want to keep only 2 versions: the policy should then ideally delete version.1.0.1.zip and version.1.0.2.zip while keeping version.1.0.3.zip.
What would be the best approach in such cases, where I want to keep the last few versions and delete anything before that? The last 2 or 3 versions can be identified by their creation date, but that date will differ from service to service.
The conditions you can use in lifecycle rules are fairly limited: https://cloud.google.com/storage/docs/lifecycle#conditions
So if you use a different object name for every revision, cleaning up with just lifecycle rules seems impossible to me.
If you were to switch to pure GCS object versioning for version management (that is, uploading every release under the same object name), it is however trivial to do:
{
  "rule": [
    {
      "action": {
        "type": "Delete"
      },
      "condition": {
        "numNewerVersions": 2
      }
    }
  ]
}
If you do need to use a different object name each time, the easiest solution to me seems to be a small script that determines the latest 2 versions of each file from the list of object names (and/or their modification dates) and then deletes the older ones. Deploy this script as a Cloud Function, give its service account permission to list and delete objects in the bucket, and trigger it with a Cloud Scheduler job every x hours/days.
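A minimal sketch of such a script, assuming object names follow the `<folder>/<service>.<version>.zip` pattern from the question and that the `google-cloud-storage` Python client is available. The names `select_stale` and `cleanup_bucket` are hypothetical, and the version parsing is only one possible way to rank releases:

```python
import re
from collections import defaultdict

# Assumed naming scheme: "<folder>/<service>.<dotted version>.zip",
# e.g. "customerservice/customerservice.1.0.2.zip".
VERSIONED_ZIP = re.compile(r"^(?P<stem>.+?)\.(?P<version>\d+(?:\.\d+)*)\.zip$")


def select_stale(object_names, keep=2):
    """Return the names to delete so that only the `keep` highest versions
    of each service remain. Names that do not match the scheme are ignored."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    groups = defaultdict(list)
    for name in object_names:
        match = VERSIONED_ZIP.match(name)
        if match is None:
            continue
        # Compare versions numerically so that 1.0.10 sorts after 1.0.2.
        version = tuple(int(part) for part in match.group("version").split("."))
        groups[match.group("stem")].append((version, name))
    stale = []
    for stem in sorted(groups):
        entries = sorted(groups[stem])
        stale.extend(name for _, name in entries[: max(len(entries) - keep, 0)])
    return stale


def cleanup_bucket(bucket_name, keep=2, dry_run=True):
    """List a bucket, pick the stale objects and (optionally) delete them."""
    # Imported here so select_stale can be used without the client installed.
    from google.cloud import storage

    client = storage.Client()
    names = [blob.name for blob in client.list_blobs(bucket_name)]
    stale = select_stale(names, keep=keep)
    bucket = client.bucket(bucket_name)
    for name in stale:
        if not dry_run:
            bucket.blob(name).delete()
    return stale
```

Calling `cleanup_bucket("devwinpkgs", keep=2)` with the default `dry_run=True` only returns the names that would be removed, which is probably worth checking before letting a scheduled job delete anything.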
Alternatively, you could use a Cloud Function that is triggered by object-create notifications, checks how many older versions of the newly created object exist, and cleans them up if needed.
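As a rough sketch of the event-driven variant, assuming a 1st-gen background Cloud Function bound to the bucket's object-finalize event (whose payload carries `bucket` and `name`) and the `google-cloud-storage` client. Here the ranking uses creation time rather than the version in the name; `stale_by_creation` and `on_object_finalized` are hypothetical names:

```python
def stale_by_creation(objects, keep=2):
    """`objects` is an iterable of (name, created) pairs from one service
    folder, where `created` is any comparable timestamp. Returns the names
    of everything except the `keep` most recently created objects."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    newest_first = sorted(objects, key=lambda item: item[1], reverse=True)
    return [name for name, _ in newest_first[keep:]]


def on_object_finalized(event, context):
    """Entry point, assumed to be wired to the object-finalize trigger."""
    # Imported here so stale_by_creation can be used without the client.
    from google.cloud import storage

    bucket_name, new_name = event["bucket"], event["name"]
    folder, separator, _ = new_name.rpartition("/")
    if not separator:
        # Object sits at the bucket root, outside any service folder.
        return []
    client = storage.Client()
    blobs = client.list_blobs(bucket_name, prefix=folder + "/")
    candidates = [
        (blob.name, blob.time_created)
        for blob in blobs
        if blob.name.endswith(".zip")
    ]
    stale = stale_by_creation(candidates, keep=2)
    bucket = client.bucket(bucket_name)
    for name in stale:
        bucket.blob(name).delete()
    return stale
```

Sorting by creation time matches the "created date" idea in the question, but it treats a re-upload of an old version as the newest release, so ranking by the version number in the name may be the safer choice.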