hadoop, hive, apache-tez

Persistence of parameter in Hive HQL?


I use a cluster with Hive.

The cluster has a specific Tez container size (set via Ambari).

However, one of our Hive operations processes more data than the others. It is the only one.

Consequently, we plan to change the Tez container size just for this process. This raises two questions:

  • Is it possible to set hive.tez.container.size and hive.tez.java.opts in HQL (like set hive.tez.java.opts=XXX)?
  • What is the scope and persistence of this action? If I set this for one query, do I have to set it back to the original value, or does it apply only to this query/Tez session/other?

Solution

  • You can set these parameters in the script like this:

    set tez.am.resource.memory.mb=8192;
    set tez.am.java.opts=-Xmx6144m;
    set tez.reduce.memory.mb=6144;    
    set hive.tez.container.size=9216;   
    set hive.tez.java.opts=-Xmx6144m;
    

    The scope is the whole session: a value stays in effect for every later query until you redefine it. If a single script contains many queries that need different parameters, set the parameters before each query. It is not possible to set different parameters for different vertices of the same query, such as map1 and map2.

    Read this article: Demystify Apache Tez Memory Tuning Step by Step
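
    As a minimal sketch of the "set before each query" approach described above: raise the container size for the heavy query only, then set it back for the rest of the script. The table names (heavy_source, heavy_result) and the restored values (4096, -Xmx3276m) are placeholders and not from the question; substitute the values configured in Ambari.

        -- Print the current value first, so you know what to restore later.
        set hive.tez.container.size;

        -- Larger containers for the one heavy operation (values from the answer above).
        set hive.tez.container.size=9216;
        set hive.tez.java.opts=-Xmx6144m;

        -- Hypothetical heavy query; table and column names are placeholders.
        INSERT OVERWRITE TABLE heavy_result
        SELECT key, count(*) FROM heavy_source GROUP BY key;

        -- Set the values back for the remaining queries in the same session.
        -- Placeholder values: use the ones your cluster is configured with.
        set hive.tez.container.size=4096;
        set hive.tez.java.opts=-Xmx3276m;

    If the heavy operation runs in its own script and its own session, the last two statements should not be needed, since the overrides end with the session.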