hadoop  import  hive  sybase  sqoop

Incremental import in Sqoop on a table with unordered data and no modified-time column


Suppose I have a table Customer:

CustomerID  CustomerName  CustomerBill
7           John          100
2           Bill          500
4           Mark          200

Here CustomerID is the primary key but the records are in no particular order. There is no modified time column in the corresponding table in the database. The previous entries can change as well. How do I do incremental imports on the data?

The database I am using is Sybase, and I am importing the data into Hive.


Solution

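  • Records are in no particular order.

    append mode cannot be used: it relies on a check column whose value only increases as new rows are added.

    There is no modified time column in the corresponding table in the database.

    lastmodified mode cannot be used: it needs a column that holds each row's last-update timestamp.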

    Sqoop does not do anything special. It needs an incrementing ID or an updated timestamp to build a SQL query that fetches ONLY the inserted/updated records.
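
    To make the limitation concrete, here is a minimal Python sketch of the kind of filter an append-style incremental import applies. It is an illustration only, not Sqoop's actual generated SQL: it uses an in-memory SQLite table in place of Sybase, and the helper rows_seen_by_append, the changed bill amount, and the extra customers (Ann, Zoe) are hypothetical. The assumption is that the first import ran when the highest CustomerID was 7.

```python
import sqlite3


def rows_seen_by_append(conn, last_value):
    """Return the rows an append-style filter would fetch:
    only rows whose check column is greater than the last imported value."""
    cur = conn.execute(
        "SELECT CustomerID, CustomerName, CustomerBill "
        "FROM Customer WHERE CustomerID > ? ORDER BY CustomerID",
        (last_value,),
    )
    return cur.fetchall()


# Stand-in for the source table (SQLite here; the question uses Sybase).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer ("
    "CustomerID INTEGER PRIMARY KEY, CustomerName TEXT, CustomerBill INTEGER)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(7, "John", 100), (2, "Bill", 500), (4, "Mark", 200)],
)

# Assumed state after the first full import: highest CustomerID seen was 7.
last_value = 7

# Hypothetical changes made after the first import.
conn.execute("UPDATE Customer SET CustomerBill = 650 WHERE CustomerID = 2")  # update
conn.execute("INSERT INTO Customer VALUES (?, ?, ?)", (3, "Ann", 50))  # new row, low ID
conn.execute("INSERT INTO Customer VALUES (?, ?, ?)", (9, "Zoe", 75))  # new row, high ID

picked_up = rows_seen_by_append(conn, last_value)
print(picked_up)  # [(9, 'Zoe', 75)]
```

    Only the row with CustomerID 9 is fetched. The updated row (ID 2) and the new row with a lower ID (ID 3) are missed, and without a timestamp column there is nothing a lastmodified-style filter could compare against either.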