I have a AWS redshift cluster (say Cluster A) and a database (say db A) in it. I have created an external schema (say sch A) and created several external tables in it, which have their data in s3.
Now, I want to create another external schema (say sch B) where I want to create some other external tables. My intention is to create these schemas from admin user and provide separate users access to separate schemas using GRANT USAGE ON
.
However, after creating sch B, I observe that all tables which I created in sch A are visible and could be queried.
Any idea about why is this happening and how to prevent this. Please suggest.
Note- I created both the external schemas with same IAM role which has policy to read the s3 bucket where the data resides. I don't think this is the issue.
You are running into a confusion lots of people run into.
An external schema is badly misnamed - it is not like a schema at all.
An external schema is and is only a pointer to an external database. The external table metadata is stored in that extenral database.
You can have any number of external schemas pointing to the same external database, and when you create a table using any of those external schema, the table metadata will be written to the external database the external schema points at - which means, of course, it will be seen through all the other external schemas because they all point at the same external database.
If you want the content of the schemas to be visible in only the one schema, you need to make a new external database for each schema. Note that having too many external databases seems (I've yet to investigate properly, but it looks like it) to make the system tables which carry external table information extremely slow - tens of minutes to return a query. I never use GUI tools for SQL, but I could imagine they are querying this system table and it could for them be a problem.