
Unix: merge multiple CSV files with the same header, keeping only the header of the first file


I have to merge multiple CSV files that share the same header. I need to keep the header of the first file, drop the headers of all the other files, and combine everything into one master file.

file 1:

Id,city,name ,location
1,NA,JACK,CA

file 2:

ID,city,name,location
2,NY,JERRY,NY

output:

Id,city,name,location
1,NA,JACK,CA
2,NY,JERRY,NY

Currently I am using this code:

ls *.csv | xargs -n 1 tail -n+2 > master.csv

This merges the data rows of the files correctly, but it strips the header from every file, including the first one, so the master file ends up with no header at all.

What should I do?


Solution

  • awk 'FNR==1 && NR!=1{next;}{print}' *.csv
    

    Tested on Solaris, where the command is run as nawk:

    > cat file1.csv
    Id,city,name ,location
    1,NA,JACK,CA
    >
    > cat file2.csv
    ID,city,name,location
    2,NY,JERRY,NY
    >
    > nawk 'FNR==1 && NR!=1{next;}{print}' *.csv
    Id,city,name ,location
    1,NA,JACK,CA
    2,NY,JERRY,NY
    > 
    

    Explanation given by kevin-d:

    FNR is the number of lines (records) read so far in the current file; NR is the number of lines read overall, across all files. So the rule 'FNR==1 && NR!=1{next;}' says, "Skip this line if it is the first line of the current file but not the first line read overall." The second rule, {print}, prints every line that was not skipped. The effect is that the CSV header of the first file is printed while the headers of the remaining files are dropped.
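
To see how the two counters behave, a minimal sketch can print them next to each input line. The file names a.csv and b.csv and the scratch directory are made up for this illustration; only NR, FNR and FILENAME are standard awk variables.

```shell
# Illustration only: a.csv and b.csv are hypothetical sample files.
tmpdir=$(mktemp -d)
printf 'Id,city,name,location\n1,NA,JACK,CA\n' > "$tmpdir/a.csv"
printf 'Id,city,name,location\n2,NY,JERRY,NY\n' > "$tmpdir/b.csv"

# For every line, print the file name, the overall count (NR)
# and the per-file count (FNR).
counters=$(cd "$tmpdir" && awk '{ print FILENAME, "NR=" NR, "FNR=" FNR }' a.csv b.csv)
printf '%s\n' "$counters"
# a.csv NR=1 FNR=1
# a.csv NR=2 FNR=2
# b.csv NR=3 FNR=1
# b.csv NR=4 FNR=2

rm -rf "$tmpdir"
```

The third line (FNR=1 but NR=3) is the only one that satisfies FNR==1 && NR!=1, so it is the only one the answer's awk command would skip.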

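
The awk command above prints to standard output, while the question asks for a master.csv file. One possible way to write that file is sketched below. It assumes the merged file lives in the same directory as the inputs, so a master.csv left over from an earlier run would itself match *.csv. The scratch directory, the sample files and the temporary name master.tmp are hypothetical choices for this sketch, not part of the original answer.

```shell
# Illustration only: the directory and file names below are hypothetical.
workdir=$(mktemp -d)
printf 'Id,city,name,location\n1,NA,JACK,CA\n' > "$workdir/file1.csv"
printf 'ID,city,name,location\n2,NY,JERRY,NY\n' > "$workdir/file2.csv"

(
  cd "$workdir" || exit 1
  # Remove any merged file from an earlier run before the glob is expanded,
  # so it is not read back in as input.
  rm -f master.csv
  # Write to a name that *.csv does not match, then rename once awk succeeds.
  awk 'FNR==1 && NR!=1 {next} {print}' *.csv > master.tmp &&
    mv master.tmp master.csv
)

merged=$(cat "$workdir/master.csv")
printf '%s\n' "$merged"
rm -rf "$workdir"
```

With the two sample files, master.csv holds the header of file1.csv followed by the two data rows. The shell expands *.csv in sorted order, so "the first file" here means the first name in that order.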