shell awk cygwin

Splitting CSV file based on a column


I have a large CSV file and want to split it into smaller files based on the category, which is column B in the CSV file.

My CSV file looks like this:

Product     Category
Printer      Supplies

I’m currently using awk -F, '{print > ($2".txt")}' input.csv, which works fine. It generates one text file per category.

I now want to remove the category field from each of the generated files (i.e. remove the comma and everything after it).

At the moment, each line in the generated text files has the format Product,Category. It should contain Product only.

I tried using cut -d',' -f1 *.txt, but this only prints the result and does not save it back to each file separately.

Also, is there a way to combine both commands into one line? Or can awk split on the category ($2) but print only $1? That would save some time.

Thanks.


Solution

  • If you only want your files to include the first field of each record, then do exactly what you are doing now, but print only the first field:

    awk -F, '{print $1 > ($2".txt")}' input.csv
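If the file has a header row, Windows line endings (common under Cygwin), or a very large number of categories, the one-liner above may need a few extra guards. The following is only a sketch under those assumptions: the sample rows and the scratch directory are made up for illustration, and the header skip assumes that the first line of input.csv is Product,Category.

```shell
# Work in a scratch directory so existing .txt files are left alone.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data, comma-separated with a header row.
printf '%s\n' 'Product,Category' 'Printer,Supplies' 'Toner,Supplies' 'Laptop,Hardware' > input.csv

awk -F, '
    { sub(/\r$/, "") }          # drop a trailing CR, if any (Windows line endings)
    NR == 1 { next }            # skip the header row (assumed to be line 1)
    {
        out = $2 ".txt"         # one output file per category
        print $1 >> out         # append only the product name
        close(out)              # keep at most one output file open
    }
' input.csv
```

Because this version appends with >>, running it twice in the same directory would duplicate lines, so remove old output files first. Calling close() after every line is slower than keeping the files open, but it avoids "too many open files" errors in awk implementations that limit the number of open output files.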