perl, merge, perl-hash

How to parse multiple csv files with Perl and print only the unique results


I have a bunch of CSV files in a simple format, say 'Name,Country,Currency'. I need to read all of them and print only the unique union of their rows. If a row shows up in several files, the copies are identical. I tried to use Hash::Merge, but it seems to work for only two hashes. I assume I have to reinitialize it in the loop that opens these files for reading, but I am not sure how. In the end I want a file in the same format that contains all the rows without repetition. Many thanks.

Input looks like:

EDL,Finland,Euro

I want the output in the same format. I made a loop that reads the files, and at any stage I have two hashes, %A and %B, with $name as the key (after splitting each line):

$A{$name}=$coun and $B{$name}=$curr 

I also have two %merged hashes defined as

$merged1 = Hash::Merge->new('LEFT_PRECEDENT'); 
my %merged1 = %{ $merged1->merge( \%merged1, \%A ) }; 

The error I get complains about an unknown function "merge". It must be something simple, but I cannot see it.


Solution

  • Assuming the lines considered duplicates are identical in all fields and the data is uniform, you can get away with something as simple as

    perl -ne'print unless $seen{$_}++' universe* > out.csv 
    

    This is a simple dedupe routine: each input line is used as a hash key, and a line is printed only the first time it is seen. The shell then redirects the output into out.csv.
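
For anyone who would rather keep this in a script than a one-liner, the same idea could be written out roughly as follows. This is only a sketch, not the answerer's code: the sub name unique_lines and the file name dedupe.pl are made up for illustration. It also makes two assumptions the one-liner does not: line endings are stripped before comparing (so a CRLF file or a last line without a trailing newline still counts as a duplicate), and blank lines are skipped.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return the lines of the given files in first-seen order, skipping any
# line that has already been returned. (Hypothetical helper name.)
sub unique_lines {
    my @files = @_;
    my %seen;      # line text => number of times seen so far
    my @unique;
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while ( my $line = <$fh> ) {
            $line =~ s/\R\z//;      # assumption: ignore LF vs CRLF differences
            next if $line eq '';    # assumption: blank lines are not data
            push @unique, $line unless $seen{$line}++;
        }
        close $fh;
    }
    return @unique;
}

# Print every distinct line from the files named on the command line.
print "$_\n" for unique_lines(@ARGV);
```

It would be run the same way as the one-liner, for example `perl dedupe.pl universe* > out.csv`. Like the one-liner, it compares whole lines, so two rows with the same Name but different Country or Currency would both be kept.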