Tags: hadoop, kerberos, parquet, cloudera-cdh

Using parquet-tools with Kerberos CDH


I am trying to discover the schema of a Parquet file. I ran the following command:

parquet-tools schema hdfs://<MY_IP>:8020/<PATH_TO_PARQUET>/<PARQUET_FILE_NAME>.parquet

But I got the error:

SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]

Does anyone know how to use parquet-tools in a Kerberized environment? I have a keytab with the required permissions, and I run the kinit command beforehand.


Solution

  • The hadoop.security.authentication property can take the values simple or kerberos.

    From the error you get, it is clear that the cluster is set to kerberos, while the client is still attempting simple authentication.

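    1. Make sure you run parquet-tools after kinit, in the same session, so that it can find the Kerberos ticket.

    2. If that does not work, check that the core-site.xml and hadoop-policy.xml files used by the client are configured properly, in particular that hadoop.security.authentication is set to kerberos on the machine where you run parquet-tools.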
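The two steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name parquet_schema_kerberized and its three arguments are hypothetical, the Kerberos client is assumed to be MIT Kerberos (kinit -kt, klist -s), the client configuration is assumed to live in /etc/hadoop/conf, and the parquet-tools launcher is assumed to honor HADOOP_CONF_DIR (it may not if it does not go through the hadoop script).

```shell
# Hypothetical helper; the name and arguments are illustrative only.
# Usage: parquet_schema_kerberized <keytab> <principal> <file-uri>
parquet_schema_kerberized() {
  local keytab="$1" principal="$2" file_uri="$3"

  # Step 1: obtain a ticket non-interactively from the keytab.
  kinit -kt "$keytab" "$principal" || return 1

  # With MIT Kerberos, klist -s prints nothing and exits non-zero
  # when the credential cache has no valid (unexpired) ticket.
  klist -s || { echo "no valid Kerberos ticket" >&2; return 1; }

  # Step 2: point the client at the cluster's configuration so that
  # core-site.xml (hadoop.security.authentication=kerberos) is picked up.
  # Assumption: /etc/hadoop/conf is where the client configs are deployed.
  export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

  # Run the original command in the same session as the ticket.
  parquet-tools schema "$file_uri"
}
```

It would be called with the keytab path, the principal, and the same hdfs:// URI as in the question. If the error persists, running hdfs getconf -confKey hadoop.security.authentication on the same host shows which value the client configuration actually resolves to.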