
JOIN in Hive triggers which type of JOIN in MapReduce?


If I have a Hive query that uses a JOIN, say a LEFT OUTER JOIN or an INNER JOIN of two tables on any column, how do I know which type of join it is converted into in the back-end MapReduce job (i.e. a map-side join or a reduce-side join)?

Thanks.


Solution

  • Run EXPLAIN SELECT ... and check the plan. It shows exactly what the map and reduce phases will do. During execution you can also check the logs on the JobTracker to see what the mapper and reducer processes are doing.

    For example, the following fragment of an EXPLAIN plan shows a map-side join (note the Map Join Operator in the plan):

     Stage: Stage-33
        Map Reduce
          Map Operator Tree:
              TableScan
                alias: s
                filterExpr: (col is not null) (type: boolean)
                Statistics: Num rows: 85 Data size: 78965 Basic stats: COMPLETE Column stats: NONE
                Filter Operator
                  predicate: (col is not null) (type: boolean)
                  Statistics: Num rows: 22 Data size: 20438 Basic stats: COMPLETE Column stats: NONE
                  Map Join Operator
                    condition map:
                         Inner Join 0 to 1
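
If you save the EXPLAIN output to a file, the same check can be scripted. The following is a minimal Python sketch, not part of Hive itself: the function name classify_joins is hypothetical, and treating a bare "Join Operator" line as a reduce-side join is an assumption about how Hive prints common joins. Only the "Map Join Operator" label is taken from the plan above.

```python
import re


def classify_joins(plan_text):
    """Return (stage, join_kind) pairs found in Hive EXPLAIN output.

    Assumption: a line ending in "Map Join Operator" marks a map-side join,
    and a line that is exactly "Join Operator" marks a reduce-side join.
    """
    results = []
    stage = None
    for raw in plan_text.splitlines():
        line = raw.strip()
        match = re.match(r"Stage: (\S+)", line)
        if match:
            # Remember which stage the following operators belong to.
            stage = match.group(1)
        elif line.endswith("Map Join Operator"):
            results.append((stage, "map-side"))
        elif line == "Join Operator":
            results.append((stage, "reduce-side"))
    return results


# Shortened copy of the plan fragment shown above.
sample = """\
Stage: Stage-33
  Map Reduce
    Map Operator Tree:
        TableScan
          alias: s
          Filter Operator
            predicate: (col is not null) (type: boolean)
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
"""

print(classify_joins(sample))  # prints [('Stage-33', 'map-side')]
```

An empty result means no join operator was found in the text you passed in, so check that you captured the full plan before drawing conclusions.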