I am about to draft a rough concept for how our data lake should be structured, and one thing struck me about containers. Are there any benefits to having multiple containers, e.g. one container per use case? I can also picture a single container with a separate folder per use case. From an access-management perspective we have container-level RBAC and folder- and file-level ACLs, so as far as I can see the only difference between the two layouts is the security concept. Are there any killer arguments for following one approach rather than the other?
First, the benefits of containers:
Container soft delete can be enabled, so a deleted container can be restored within the retention period. This feature works at the container level and does not apply to individual folders.
We can set the public access level of a container (private, blob, or container) when we create it.
So I think containers offer stronger security and isolation. From an access-management perspective, containers give you coarser-grained access control (RBAC scoped to the container), while folders and files are controlled with finer-grained ACLs. Which approach to choose depends on your needs.
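To make the two layouts concrete, here is a minimal Python sketch that only builds the `abfss://` paths each layout would produce and names the coarsest mechanism that separates one use case from another. It does not call Azure. The account name `examplelake`, the shared container name `lake`, and all helper names are hypothetical and chosen just for illustration.

```python
from dataclasses import dataclass

ACCOUNT = "examplelake"  # hypothetical storage account name


@dataclass(frozen=True)
class Location:
    container: str
    folder: str  # "" means the container root

    def uri(self) -> str:
        # ABFS URI form: abfss://<container>@<account>.dfs.core.windows.net/<path>
        base = f"abfss://{self.container}@{ACCOUNT}.dfs.core.windows.net"
        return f"{base}/{self.folder}" if self.folder else base


def container_per_use_case(use_case: str) -> Location:
    """Layout A: each use case gets its own container."""
    return Location(container=use_case, folder="")


def folder_per_use_case(use_case: str, shared_container: str = "lake") -> Location:
    """Layout B: one shared container, one top-level folder per use case."""
    return Location(container=shared_container, folder=use_case)


def isolation_boundary(loc: Location) -> str:
    """Name the coarsest mechanism that separates this use case from others."""
    return "container-scoped RBAC" if not loc.folder else "folder ACL"


if __name__ == "__main__":
    for layout in (container_per_use_case, folder_per_use_case):
        loc = layout("sales")
        print(f"{layout.__name__}: {loc.uri()} ({isolation_boundary(loc)})")
```

With one container per use case, a role assignment scoped to the container is enough to separate teams. With one shared container, the separation has to come from ACLs on each folder.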