hadoop, hive, apache-spark-sql, impala, hive-metastore

Hive/Impala column-comments cut off after several characters


When I look at the column comments in our data lake (Hadoop; the comments were added when the Parquet tables were created with Hive or Impala), they are cut off after ~200 characters.

Could this be a global character setting in our Hadoop system or a Hive restriction? If not, is there a way to set the maximum string length for comments during table creation? Unfortunately, I have no admin access to the system itself and therefore only limited insight.


Solution

  • Column comments are stored in the Hive Metastore table COLUMNS_V2, in a column called COMMENT. Currently, the size of that column is limited to 256 characters (see, for example, the MySQL metastore schema definition for Hive version 3.0.0). In the upcoming 4.0 (?) version, it seems to have been expanded to varchar(4000), but the associated Hive JIRA issue (HIVE-4921) is still listed as unresolved and doesn't mention a target release.
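
Since the limit sits in the metastore schema and cannot be changed without admin access, one workaround is to check comment lengths before running the DDL. The following is only a minimal sketch: the 256-character limit is taken from the schema definition mentioned above and may differ on your cluster, and the helper names and sample data are hypothetical.

```python
# Assumed limit, based on the varchar(256) definition of COLUMNS_V2.COMMENT
# mentioned above; adjust it if your metastore schema differs.
METASTORE_COMMENT_LIMIT = 256


def find_overlong_comments(column_comments, limit=METASTORE_COMMENT_LIMIT):
    """Return {column_name: comment_length} for comments longer than `limit`."""
    return {
        name: len(comment)
        for name, comment in column_comments.items()
        if len(comment) > limit
    }


def shorten_comment(comment, limit=METASTORE_COMMENT_LIMIT, marker="..."):
    """Trim a comment to at most `limit` characters, ending in `marker` if trimmed."""
    if len(comment) <= limit:
        return comment
    return comment[: limit - len(marker)] + marker


# Hypothetical column comments that would go into a CREATE TABLE statement.
comments = {
    "customer_id": "Unique customer identifier",
    "notes": "x" * 300,
}

overlong = find_overlong_comments(comments)
print(overlong)  # columns whose comments would not fit into the metastore column
```

Trimming the text yourself, with a visible marker at the end, at least makes the cut explicit. Longer descriptions would need to be kept somewhere other than the column comment.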