Hive multilevel partitions and select with where clause

I have hive table with 2 partitions and the 1st partition is the city and the second partition is village, So every city partition will contains the list of all village partitions in it.Some thing like below

city1/village1
city1/village2
city1/village3
city2/village5
city2/village6

So if my select statement is select * from table where village = 'village5' will it search all the partitions in city 1 and city 2 before outputting the result? Or will it see the hive metastore file and hit only the village5 partition.

Solution

It will depend on your Hive version how optimize it is. In my current version (1.1.0) Hive is able to point to the specific partition without scanning the top partition

Here is a quick demonstration.

create table mydb.partition_test 
(id string)
partitioned by (city string, village string);

INSERT OVERWRITE TABLE mydb.partition_test PARTITION (city,village)
select * from (

select '1', 'city1', 'village1'
union all 
select '1', 'city1', 'village2'
union all 
select '1', 'city1', 'village3'
union all 
select '1', 'city2', 'village5'
union all 
select '1', 'city2', 'village6'
) t;

explain select * from mydb.partition_test where village='village5';

STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: partition_test
          filterExpr: (village = 'village5') (type: boolean)
          Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: PARTIAL
          Select Operator
            expressions: id (type: string), city (type: string), 'village5' (type: string)
            outputColumnNames: _col0, _col1, _col2
            Statistics: Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: PARTIAL
            ListSink

As you can see from the execution plan, it is able to estimate the number of records for that specific partition without mapred operation and the table scan is pointing to the specific partition.