I have two tables in hive:
Table 1:
1,Nail,maher,24,6.2
2,finn,egan,23,5.9
3,Hadm,Sha,28,6.0
4,bob,hope,55,7.2
Table 2 :
1,Nail,maher,24,6.2
2,finn,egan,23,5.9
3,Hadm,Sha,28,6.0
4,bob,hope,55,7.2
5,john,hill,22,5.5
6,todger,hommy,11,2.2
7,jim,cnt,99,9.9
8,will,hats,43,11.2
Is there any way in Hive to retrieve the new data in table 2 that doesn't exist in table 1??
In other Databases tools, you would use a inner left/right. But inner left/right doesn't exist in Hive and suggestions how this could be achieved?
If you are using Hive version >= 0.13 you can use this query:
SELECT * FROM A WHERE A.firstname, A.lastname ... IN (SELECT B.firstname, B.lastname ... FROM B);
But I'm not sure if Hive supports multiple coloumns in the IN clause. If not something like this could work:
SELECT * FROM A WHERE A.firstname IN (SELECT B.firstname FROM B) AND A.lastname IN (SELECT b.lastname FROM B) ...;