Tags: amazon-web-services, pyspark, amazon-redshift, orc

Error "declared column type INT for column id incompatible with ORC file column type string query" when copying ORC to Redshift


Copying an ORC file to Redshift fails with the error "declared column type INT for column id incompatible with ORC file column type string query" when using the command:

from 's3://' 
iam_role 'role' 
format as orc;

Solution

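  • This happens when the columns in the ORC file are not in the same order as the columns of the Redshift table. COPY from ORC matches columns by position rather than by name, so a string column in the file can end up lined up against an INT column in the table. Select the DataFrame columns in the table's order before writing the ORC file: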

    df = df.select('col1', 'col2', 'col3')
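A slightly fuller version of the same idea is sketched below, assuming the ORC file is produced from a PySpark DataFrame. The helper names (`columns_in_table_order`, `write_orc_in_table_order`) and the `table_columns` and `s3_path` parameters are hypothetical and not part of the original answer. The check at the start is there to surface a column mismatch before the write rather than at COPY time.

```python
def columns_in_table_order(df_columns, table_columns):
    """Return the Redshift table's column order after checking that the
    DataFrame has exactly the same set of columns (hypothetical helper)."""
    missing = [c for c in table_columns if c not in df_columns]
    extra = [c for c in df_columns if c not in table_columns]
    if missing or extra:
        raise ValueError(f"column mismatch: missing={missing}, extra={extra}")
    return list(table_columns)


def write_orc_in_table_order(df, table_columns, s3_path):
    """Reorder a PySpark DataFrame to the table's column order, then write ORC.

    table_columns: column names in the order of the Redshift table definition.
    s3_path: destination prefix that the COPY command will read from.
    """
    ordered = columns_in_table_order(df.columns, table_columns)
    # select() fixes the column positions, which is what COPY relies on for ORC.
    df.select(*ordered).write.mode("overwrite").orc(s3_path)
```

Here `table_columns` would be typed out in the same order as the columns in the Redshift table definition. Reordering only fixes the positions: if a column's type in the DataFrame still differs from the table's declared type, it would also need a cast before writing.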