How to compare differences in very large CSV files

I have to compare two CSV files of 2-3 GB each on a Windows platform.

I've tried loading the first file into a HashMap and comparing it against the second one, but the result (as expected) is very high memory consumption.

The goal is to write the differences to another file.

The lines may appear in a different order, and some may be missing.

Any suggestions?

Solution

Assuming you wish to do this programmatically in Java, the answer depends on whether the files are ordered.

Are both of the files ordered? If so, you don't need to read in the whole files. Start at the beginning of both files and repeat the following until both are exhausted:

If the entries match, advance the "current" line in both files.
If the entries don't match, determine which file's line would come first in the sort order, output that line as a difference, and advance the current line in that file only.

Once one file runs out, every remaining line in the other file is a difference.
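The walk over two ordered files could be sketched roughly as below. This is only an illustration under some assumptions: both files are sorted by whole line in String.compareTo order, they are UTF-8 encoded, and each CSV record fits on one line. The class name SortedFileDiff, the method diff, and the "< " / "> " prefixes are made up for this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SortedFileDiff {

    /**
     * Writes every line that appears in only one of two sorted files to out.
     * Lines only in left get the prefix "< ", lines only in right get "> ".
     * Only one line per file is held in memory at a time.
     */
    public static void diff(Path left, Path right, Path out) throws IOException {
        try (BufferedReader a = Files.newBufferedReader(left, StandardCharsets.UTF_8);
             BufferedReader b = Files.newBufferedReader(right, StandardCharsets.UTF_8);
             BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String x = a.readLine();
            String y = b.readLine();
            while (x != null || y != null) {
                int cmp;
                if (x == null) {
                    cmp = 1;              // left is exhausted: the rest of right is extra
                } else if (y == null) {
                    cmp = -1;             // right is exhausted: the rest of left is extra
                } else {
                    cmp = x.compareTo(y); // must be the same order the files were sorted in
                }

                if (cmp == 0) {           // entries match: advance both files
                    x = a.readLine();
                    y = b.readLine();
                } else if (cmp < 0) {     // left line comes first: it is missing from right
                    w.write("< " + x);
                    w.newLine();
                    x = a.readLine();
                } else {                  // right line comes first: it is missing from left
                    w.write("> " + y);
                    w.newLine();
                    y = b.readLine();
                }
            }
        }
    }
}
```

If the files were sorted with a different ordering (for example by a key column or a locale-aware collation), the comparison here would have to use that same ordering, otherwise matching lines can be reported as differences.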

If you don't have ordered files, then you could order the files prior to the diff. Again, since you need a low-memory solution, don't read the entire file into memory to sort it. Chop the file up into manageable chunks, sort each chunk in memory, and write it out to a temporary file. Then merge the sorted chunks (the merge step of an external merge sort, not an insertion sort), reading only one line at a time from each chunk.
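A minimal sketch of sorting a file in chunks might look like the following. It assumes plain line-based records in UTF-8 and sorts by whole line with String.compareTo; a header row, if present, would be sorted like any other line. The class name ExternalSort, the method sort, and the maxLinesPerChunk parameter are hypothetical names chosen for the example, and the chunk size would need tuning to the available heap.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class ExternalSort {

    /** One open chunk file plus the line currently at its head. */
    private static final class Cursor implements Comparable<Cursor> {
        final BufferedReader reader;
        String line;

        Cursor(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.line = reader.readLine();
        }

        @Override
        public int compareTo(Cursor other) {
            return line.compareTo(other.line);
        }
    }

    public static void sort(Path input, Path output, int maxLinesPerChunk) throws IOException {
        List<Path> chunks = new ArrayList<>();

        // Phase 1: read a bounded number of lines, sort them, write a temp chunk.
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            List<String> buffer = new ArrayList<>();
            String line;
            while ((line = in.readLine()) != null) {
                buffer.add(line);
                if (buffer.size() >= maxLinesPerChunk) {
                    chunks.add(writeSortedChunk(buffer));
                    buffer.clear();
                }
            }
            if (!buffer.isEmpty()) {
                chunks.add(writeSortedChunk(buffer));
            }
        }

        // Phase 2: k-way merge. The heap holds one line per chunk, so memory
        // use depends on the number of chunks, not on the file size.
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path chunk : chunks) {
                Cursor c = new Cursor(Files.newBufferedReader(chunk, StandardCharsets.UTF_8));
                if (c.line != null) heap.add(c); else c.reader.close();
            }
            while (!heap.isEmpty()) {
                Cursor c = heap.poll();          // smallest head line across all chunks
                out.write(c.line);
                out.newLine();
                c.line = c.reader.readLine();    // advance only the chunk just consumed
                if (c.line != null) heap.add(c); else c.reader.close();
            }
        } finally {
            // Close any readers left open by a failure, then remove the temp files.
            for (Cursor c : heap) c.reader.close();
            for (Path chunk : chunks) Files.deleteIfExists(chunk);
        }
    }

    private static Path writeSortedChunk(List<String> lines) throws IOException {
        Collections.sort(lines);
        Path chunk = Files.createTempFile("chunk", ".tmp");
        Files.write(chunk, lines, StandardCharsets.UTF_8);
        return chunk;
    }
}
```

Sorting both inputs this way leaves them in the same order, so they can then be compared line by line without holding either file in memory.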