amazon-web-services amazon-iam aws-glue

How to restrict access to a single glue catalog database with IAM Role Policy?


My Glue Data Catalog contains many databases. I'm trying to write an IAM role policy that denies access to every database in the catalog except one whitelisted database. How can this be done?

In my first attempt, I used the managed policy arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess as a starting point, and added the statement shown below to deny all databases except the whitelisted one.

{
    "Version": "2012-10-17",
    "Statement": [

        # same statements as in arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess ...
        
        {
            "Sid": "DenyAccessToOtherGlueDatabases",
            "Effect": "Deny",
            "Action": [
                "glue:GetDatabase"
            ],
            "NotResource": [      
                "arn:aws:glue:${region}:${account_number}:database/${database_name}"
            ]
        }
    ]
}

However, the Role is still able to access all the databases in the catalog despite this rule. What could be the issue?


I made a second attempt after finding this documentation, which reads:

To deny access to a table, requires that you create a policy to deny a user access to the table, or its parent database or catalog. This allows you to easily deny access to a specific resource that cannot be circumvented with a subsequent allow permission. For example, if you deny access to table books in database db1, then if you grant access to database db1, access to table books is still denied. The following is an example identity-based policy that denies permissions for AWS Glue actions (glue:GetTables and GetTable) to database db1 and all of the tables within it.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyGetTablesToDb1",
            "Effect": "Deny",
            "Action": [
                "glue:GetTables",
                "glue:GetTable"
            ],
            "Resource": [
                "arn:aws:glue:us-west-2:123456789012:database/db1"
            ]
        }
    ]
}

I tried to deny glue:GetTables and glue:GetTable on the table:

{
    "Version": "2012-10-17",
    "Statement": [

        # same statements as in arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess ...
        
        {
            "Sid": "DenyAccessToOtherGlueDatabases",
            "Effect": "Deny",
            "Action": [
                "glue:GetTables",
                "glue:GetTable"        
            ],
            "NotResource": [ 
                "arn:aws:glue:${region}:${account_number}:table/${landing_zone_name}_${env}"
            ]
        }
    ]
}

I also tried to deny glue:GetTables and glue:GetTable on the database, as shown in the example:

{
    "Version": "2012-10-17",
    "Statement": [

        # same statements as in arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess ...
        
        {   
            "Sid": "DenyAccessToOtherGlueDatabases",
            "Effect": "Deny",
            "Action": [
                "glue:GetTables",
                "glue:GetTable"        
            ],
            "NotResource": [ 
                "arn:aws:glue:${region}:${account_number}:database/${landing_zone_name}_${env}"
            ]
        }
    ]
}

In both cases, when I tried to access tables that I should have had permission to access, I was met with an infinite loading animation that said "fetching <table_name>".

Screenshot of loading problem


Solution

  • To restrict access to a single Glue Data Catalog database, you need to whitelist every resource in the Glue Data Catalog hierarchy (catalog -> database -> table) with NotResource, as shown in the DenyAccessToOtherGlueDatabases statement below. The Deny then applies to every database and table that is not listed, so the role can reach only the whitelisted database through the Allow statements it already has.

    The Deny statement should also cover these four actions (more info on each one in the docs):

    1. glue:GetTable: Grants permission to retrieve a specific table
    2. glue:GetTables: Grants permission to retrieve the tables for a specific database
    3. glue:GetDatabase: Grants permission to retrieve a specific database
    4. glue:GetDatabases: Grants permission to retrieve all databases (without this, the database will not appear in the console)

    {
        "Version": "2012-10-17",
        "Statement": [
    
            # same statements as in arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess ...
            
            {
                "Sid": "DenyAccessToOtherGlueDatabases",
                "Effect": "Deny",
                "Action": [
                    "glue:GetTable",
                    "glue:GetTables",
                    "glue:GetDatabase",     
                    "glue:GetDatabases"
                ],
                "NotResource": [
                    "arn:aws:glue:${region}:${account_number}:catalog",
                    "arn:aws:glue:${region}:${account_number}:database/${database_name}",
                    "arn:aws:glue:${region}:${account_number}:table/${database_name}/*"
                ]
                ]
            }
        ]
    }
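
The Deny statement above can also be generated and sanity-checked with a short script. The following is only a rough sketch: the function names (build_deny_statement, arns_checked_for_table, is_denied) are hypothetical helpers, and the region, account number and database name are placeholder values. The is_denied check is a simplified model, not the real IAM evaluation logic. It assumes that a table call is authorized against the catalog, database and table ARNs together, which would explain why whitelisting only the database ARN blocked the tables in the earlier attempts.

```python
import json
from fnmatch import fnmatchcase


def build_deny_statement(region, account_id, database_name):
    """Build the Deny statement, exempting the catalog, the database and its tables."""
    prefix = f"arn:aws:glue:{region}:{account_id}"
    return {
        "Sid": "DenyAccessToOtherGlueDatabases",
        "Effect": "Deny",
        "Action": [
            "glue:GetTable",
            "glue:GetTables",
            "glue:GetDatabase",
            "glue:GetDatabases",
        ],
        "NotResource": [
            f"{prefix}:catalog",
            f"{prefix}:database/{database_name}",
            f"{prefix}:table/{database_name}/*",
        ],
    }


def arns_checked_for_table(region, account_id, database_name, table_name):
    """Assumption: a table call is checked against all three levels of the hierarchy."""
    prefix = f"arn:aws:glue:{region}:{account_id}"
    return [
        f"{prefix}:catalog",
        f"{prefix}:database/{database_name}",
        f"{prefix}:table/{database_name}/{table_name}",
    ]


def is_denied(statement, resource_arns):
    """Simplified model: the Deny hits if any checked ARN is not exempted by NotResource."""
    patterns = statement["NotResource"]
    return any(
        not any(fnmatchcase(arn, pattern) for pattern in patterns)
        for arn in resource_arns
    )


# Placeholder values for illustration only.
REGION, ACCOUNT, DATABASE = "us-west-2", "123456789012", "db1"

statement = build_deny_statement(REGION, ACCOUNT, DATABASE)
print(json.dumps(statement, indent=4))

# Same statement, but whitelisting only the database ARN (as in the earlier attempt).
database_only = dict(
    statement,
    NotResource=[f"arn:aws:glue:{REGION}:{ACCOUNT}:database/{DATABASE}"],
)

whitelisted_table = arns_checked_for_table(REGION, ACCOUNT, DATABASE, "books")
other_table = arns_checked_for_table(REGION, ACCOUNT, "db2", "books")

print("full whitelist, table in db1 denied:", is_denied(statement, whitelisted_table))
print("full whitelist, table in db2 denied:", is_denied(statement, other_table))
print("database-only whitelist, table in db1 denied:", is_denied(database_only, whitelisted_table))
```

The printed statement can be pasted after the AWSGlueConsoleFullAccess statements in the role policy. Generating the three ARNs from one database name keeps the catalog, database and table entries in step, which is the part that is easy to get wrong by hand.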