Search code examples
sqlhadoophivehiveqlimpala

Inconsistent Hive Left Join Results


I composed a simple left out join hiveql select * from a left outer join b on (a.f1=b.f1 and a.f2=b.f2) The total count of above query result is 798,608. However, the total number of records in table a is 780,499, which doesn't match.

I tried to find all records that only exist in the left join results but not in table a; the results returned blank.

I even tried to create 2 small tables (a' and b') with a few records and the count of the left join result matches the count of table a' records, as expected.

What could cause the inconsistent results?


Solution

  • Thanks to David Lee. There are 1 to many situation in table b. Problem solved.