Search code examples
csvfor-loopbatch-filefindstr

Batch script to join matching records in .csv file


I have a .csv file with 4 fields.

Field 1, 2, and 3 are text boxes

Field 4 is a number such as 1, 2, 3, etc.

There are multiple instances of field 1, 2, and 3 being the exact same for multiple records. In these instances I want to remove all but one of these records and add the number from the other (now removed) records to the end of the one remaining record.

To try and give an example:

I have

A,B,C,1
A,B,C,2
A,B,C,3
D,E,F,1
D,E,F,3

I Want

A,B,C,"1,2,3"
D,E,F,"1,3"

I have been looking into solutions for hours at this point and have gotten next to nowhere (I am completely new to scripting) as far as I can tell, I probably need to be using a for /f command or a findstr command, with certain conditions, but I'm really struggling on where to even start.


Solution

  • @ECHO OFF
    setlocal enabledelayedexpansion 
    
    for /f "tokens=1-3,* delims=," %%a in (t.csv) do (
       set "_%%a,%%b,%%c=!_%%a,%%b,%%c!%%d,"
    
    )
    (for /f "tokens=1,2 delims=_=" %%a in ('set _') do (
      set "D=%%b
      echo %%a,"!D:~0,-1!"
    ))>output.csv
    

    The first loop gathers the data (adding column 4 to a variable named as "column1 to 3".
    The second loop prints the variable names (= "column 1 to 3") plus the collected "columns 4", while removing the last comma and adding the quotes and redirecting the whole loops output to a file.

    Output with your sample data:

    A,B,C,"1,2,3"
    D,E,F,"1,3"