Search code examples
mysqlaws-glue

AWS Glue Not Importing Data for Boolean Columns from MySql DB


I have a datasource that is a MySQL db. That db has a table with a column that is defined as a bit:

is_available    bit      default b'0'              not null,

When I run a crawler against that table, it brings that column into the data catalogue as a boolean.

Then if I create a datasource in my visual ETL job to read in that table, that column exist but all the values are blank.

It does the same thing if I use a JDBC connection right to the DB instead of using the data catalogue.

If I modify the schema in the data catalogue to define the column as an int, the data source in the ETL job ignores that and calls it a boolean anyway, this is still what it has in the script:

mappings=[("is_available", "boolean", "is_available", "boolean"),... ]

So it aparently ignores the schema defined in data catalogue.

Any idea what I need to do get the values in the ETL job to populate instead of all being blank?


Solution

  • Looks like the answer is to add a transform query after the data source and define it as an int there. Seems odd that it ignores the schema defined in the data catalogue, and that int works and not boolean, but here we are.