
Is there a Pig / Hadoop property that can be set for PIG_HEAPSIZE?


Spark, Hadoop, Tez, etc. all have a list of properties that can be configured manually. For example:

yarn.nodemanager.resource.memory-mb

or

spark.executor.memory

or

pig.exec.reducers.bytes.per.reducer, pig.exec.reducers.max

....

Is there an equivalent for PIG_HEAPSIZE? It seems like it can only be set via the environment variable. What is this environment variable doing behind the scenes? Which properties does it affect?


Solution

  • Pig relies on an execution engine such as Tez, Spark, or MapReduce, so its tasks inherit their heap sizes from that engine's configuration (for example, the Spark executor memory) rather than from a Pig-specific setting.

    The only thing PIG_HEAPSIZE controls is the local JVM driver process started when you run the pig command. That is why it is a local environment variable rather than a remote configuration property.
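To make the client-side versus task-side split concrete, here is a rough sketch. It assumes PIG_HEAPSIZE is read as a number of megabytes and turned into a JVM `-Xmx` flag by the launcher; `heap_flag` is a hypothetical helper written for illustration and is not part of Pig. The MapReduce properties in the closing comment are just one example of engine-side settings, and the values are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pig): turn a heap size in MB into a
# JVM max-heap flag, the way a launcher script typically would.
heap_flag() {
    # $1: heap size in megabytes; empty means "keep the JVM default"
    if [ -n "$1" ]; then
        printf '%s\n' "-Xmx${1}m"
    fi
}

# Client side: only the JVM started by the `pig` command is affected.
PIG_HEAPSIZE=2048
export PIG_HEAPSIZE
echo "client JVM flag: $(heap_flag "$PIG_HEAPSIZE")"

# Task side: memory comes from the execution engine's own properties,
# for example on MapReduce (script.pig is a placeholder name):
#   pig -Dmapreduce.map.java.opts=-Xmx2g \
#       -Dmapreduce.reduce.java.opts=-Xmx2g script.pig
```

With this split, raising PIG_HEAPSIZE helps when the pig client itself runs out of memory (or when running in local mode, where everything runs in that one JVM), while out-of-memory errors inside cluster tasks are addressed through the engine's properties.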