hadoop, hive, hbase, hive-configuration

Prevent users from overriding default mapred properties in Hadoop 2


How can I prevent users from overriding the default properties in the Hadoop configuration files when they submit Hive jobs?

Example:

mapred-site.xml:

<property>
  <name>mapreduce.job.heap.memory-mb.ratio</name>
  <value>0.8</value>
</property>

A user can override it in a Hive job with the property below:

set mapreduce.job.heap.memory-mb.ratio=0.9 

Solution

  • From the Hadoop documentation:

    Configuration parameters may be declared final. Once a resource declares a value final, no subsequently-loaded resource can alter that value (...) Administrators typically define parameters as final in core-site.xml for values that user applications may not alter.

    <property>
      <name>dfs.hosts.include</name>
      <value>/etc/hadoop/conf/hosts.include</value>
      <final>true</final>
    </property>
    

    So, if your users connect via JDBC, you only have to edit the config files used by HiveServer2 and mark the relevant properties as "final".

    If your users connect with the legacy hive CLI and are not hackers, you can either (a) edit the global configuration for Hadoop clients, or (b) modify the "hive" launcher script so that it picks up specific config files from a non-default directory (typically done by forcing the custom directory ahead of the standard Hadoop CLASSPATH).

    If your users are hackers with access to the legacy hive CLI, they can supply their own config files, so technically you cannot enforce <final> properties. But if someone can pull that off, they will probably take your job anyway ;-)
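
    Applied to the property from the question, the entry could look like the sketch below. It assumes the file is the mapred-site.xml that HiveServer2 (or the Hadoop client) actually loads, which depends on your installation. The property name and value come from the question; only the <final> element is new.

    ```xml
    <!-- mapred-site.xml, assumed to be the copy loaded by HiveServer2 -->
    <property>
      <name>mapreduce.job.heap.memory-mb.ratio</name>
      <value>0.8</value>
      <!-- declares the value final so later-loaded resources cannot alter it -->
      <final>true</final>
    </property>
    ```

    Once the service has picked up the change, it is worth checking from a test session whether set mapreduce.job.heap.memory-mb.ratio=0.9 still changes the effective value.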