Is there a way to use sqoop to do incremental exports ? I am using Hcatalog integration for sqoop.I tried using the --last-value, --check-column options which are used for incremental import, but sqoop gave me error that the options were invalid.
I have not seen incremental sqoop export arguments. The other way you could try is to create a contol_table in hive where you keep log of the table name & timestamp when it was last exported every time.
create table if not exists control_table (
table_name string,
export_date timestamp
);
insert into control_table 'export_table1' as table_name, from_unixtime(unix_timestamp()) as export_date from control_table;
If export_table1 is the table you want to export incrementally and assuming if have already executed above two statements.
--execute below at once
--get the timestamp when the table was last executed
create temporary table control_table_now as select table_name, max(export_date) as last_export_date from control_table group by table_name;
--get incremental rows
create table new_export_table1 as select field1, field2, field3, .... timestamp1 from export_table1 e, control_table_now c where c.table_name = 'export_table1' and e.timestamp1 >= c.last_export_date;
--append the control_table for next process
insert into control_table 'export_table1' as table_name, from_unixtime(unix_timestamp()) as export_date from control_table;
Now, export the new_export_table1 table which is incrementally created using sqoop export command.