sql, duplicates, primary-key, bulk-load, monetdb

Bulk-loading in MonetDB with a primary key constraint


I am trying to bulk-load a list of objects into a one-column table whose only column is the primary key. The only purpose is to remove duplicates. I can't deduplicate the list in memory, because the file is far larger than my RAM (I need around 10^14 insertions!).

I use MonetDB's COPY INTO command, but I don't want it to fail when it hits a duplicate. I want it to insert every row that is not a duplicate and skip the rest.

Is there a way to do that with MonetDB, or any other way?


Solution

  • You could first copy the data into a table without a primary key constraint, then remove the duplicates, and finally alter the table to enforce the primary key constraint.
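
The steps above could look roughly like the following. This is only a sketch: the table names (staging, objects), the column name and type (obj VARCHAR(255)), the constraint name, and the file path are all placeholders I made up, and it assumes the file holds one value per line with no NULLs.

```sql
-- Sketch only: table names, column name/type, and file path are assumptions.

-- 1. Staging table with no constraints, so duplicates cannot abort the load.
CREATE TABLE staging (obj VARCHAR(255));

-- 2. Bulk-load the file. The path must be absolute and readable by the server.
--    Default delimiters apply: '|' between fields, newline between records.
COPY INTO staging FROM '/path/to/objects.txt';

-- 3. Materialise only the distinct values into a new table.
CREATE TABLE objects AS
    SELECT DISTINCT obj FROM staging
    WITH DATA;

-- 4. Enforce uniqueness from here on, then discard the staging data.
ALTER TABLE objects ADD CONSTRAINT objects_pk PRIMARY KEY (obj);
DROP TABLE staging;
```

Here the duplicates are removed by building a second table with SELECT DISTINCT instead of deleting rows in place. Keep in mind that the staging table has to hold the full input, duplicates included, so at the volume you describe the available disk space may become the limiting factor.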