I composed a simple left out join hiveql
select * from a left outer join b on (a.f1=b.f1 and a.f2=b.f2)
The total count of above query result is 798,608
.
However, the total number of records in table a is 780,499
, which doesn't match.
I tried to find all records that only exist in the left join results but not in table a; the results returned blank.
I even tried to create 2 small tables (a' and b') with a few records and the count of the left join result matches the count of table a' records, as expected.
What could cause the inconsistent results?
Thanks to David Lee. There are 1 to many situation in table b. Problem solved.