I'm having a problem with one of our external tables in Redshift.
We have over 300 tables in AWS Glue which have been added to our Redshift cluster as an external schema called events. Most of the tables in events can be queried fine, but when querying one of the tables, called item_loaded, we get the following error:
select * from events.item_loaded limit 1;
ERROR: XX000: Failed to incorporate external table "events"."item_loaded" into local catalog.
LOCATION: localize_external_table, /home/ec2-user/padb/src/external_catalog/external_catalog_api.cpp:358
What's weird is that the table is in the catalog:
select *
from SVV_EXTERNAL_TABLES
where tablename = 'item_loaded';
-[ RECORD 1 ]-----+------------------------------------------
schemaname | events
tablename | item_loaded
location | s3://my_bucket/item_loaded
input_format | org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
output_format | org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
serialization_lib | org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
serde_parameters | {"serialization.format":"1"}
compressed | 0
parameters | {"EXTERNAL":"TRUE","parquet.compress":"SNAPPY","transient_lastDdlTime":"1504792238"}
AFAICT, this table is configured exactly the same way as the other tables in the same schema, which are working fine. I've tried creating a new external schema pointing to the same AWS Glue database, but the same issue occurs.
What else could I potentially check? Is there anything that could cause a table to be removed from the catalog?
As per the forum post about the same issue:
The external table has more columns than Redshift allows.
You can verify the number of columns in the external table by querying svv_external_columns.
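A minimal sketch of that check, assuming the schema and table names from the question (events and item_loaded). The column_count alias is just illustrative. As far as I know, Redshift documents a limit of 1,600 columns per table (1,598 for external tables when pseudocolumns are enabled), but confirm that against the current quotas page for your cluster.

```sql
-- Count the columns Redshift sees for the failing external table.
-- svv_external_columns has one row per column of each external table.
select schemaname,
       tablename,
       count(*) as column_count
from svv_external_columns
where schemaname = 'events'
  and tablename = 'item_loaded'
group by schemaname, tablename;
```

Dropping the tablename filter and adding "order by column_count desc" should list every table in the schema by width, which makes it easy to compare item_loaded against the tables that query fine.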