
BAT file - Rename multiple .csv files according to their file content


I have multiple semicolon delimited .csv files:

sample.csv
sample1.csv
...etc.

Each has two records: a header record and a data record.

Example sample.csv:

record-type-cd;record-creation-dt;product-bank-swift-cd;term-deposit-id;saving-global-servicing-bank-name;product-bank-name;customer-account-iban;customer-id;term-deposit-reference;customer-id;term-deposit-reference;payin-iban;payout-iban
ACC;2023-08-18;BBLUK;ABCDEFG-3;check24toprovide;Testcase;DE335666666666666;ABCDEFG-1;2800123456720890;ABCDEFG-1;2800123456720890;DE66110000007000110910000;BE66110000007000110910100

I need to rename each file according to the first, second and fourth fields of its data record.

For the above example, that would be:

ACC2023-08-18ABCDEFG-3.csv

When I use the following code it gives delimiter__.csv as the new name:

@echo off
setlocal enabledelayedexpansion

REM Specify the input .csv file name and path
set "inputFile=sample.csv"

REM Read the 2nd line of the .csv file
set /a lineNumber=0
for /f "usebackq skip=1 delims=" %%a in ("%inputFile%") do (
    set /a lineNumber+=1
    if !lineNumber! equ 1 (
        set "line=%%a"
        goto processLine
    )
)

:processLine
REM Set the delimiter used in the .csv file
set "delimiter=;"

REM Split the line into separate columns
set "colIndex=0"
for %%b in ("%line:%delimiter%=" "%") do (
    set /a colIndex+=1
    set "col!colIndex!=%%~b"
)

REM Get the required columns from the line
set "newFileName=!col1!_!col2!_!col4!.csv"

REM Rename the input file
ren "%inputFile%" "!newFileName!"

echo File renamed to: !newFileName!
pause

Solution

  • The following batch file can be used for this task:

    @echo off
    setlocal EnableExtensions DisableDelayedExpansion
    (for /F "eol=| delims=" %%B in ('dir *.csv /A-D /B 2^>nul') do call :ReadData "%%B") & goto EndBatch
    :ReadData
    for /F "usebackq skip=1 tokens=1,2,4 delims=;" %%G in (%1) do set "NewFileName=%%G%%H%%I.csv" & goto RenameFile
    :RenameFile
    if /I not %1 == "%NewFileName%" ren %1 "%NewFileName%"
    goto :EOF
    :EndBatch
    endlocal
    

    The first two command lines completely define the required execution environment by

    • turning off the command echo mode,
    • enabling the command extensions required for both FOR loops, the IF condition and goto :EOF,
    • disabling delayed variable expansion so that CSV files with ! in the file name are also processed correctly.

    Next, the following command line is executed in the background: %ComSpec% /c dir *.csv /A-D /B 2>nul
    The started %SystemRoot%\System32\cmd.exe runs its internal command DIR, which

    • searches the current directory for files only because of option /A-D (attribute not directory),
    • matches the long or short file name against the wildcard pattern *.csv,
    • and outputs just the CSV file names in bare format because of option /B.

    The error message that DIR outputs when the current directory contains no CSV file is suppressed by redirecting it to the device NUL with 2>nul. The redirection operator > must be escaped with ^ on the FOR command line in the batch file so that cmd.exe processing the batch file interprets it as a literal character and not as a redirection operator.

    The Windows Command Processor instance processing the batch file captures the output of the command process executed in the background, and FOR processes the captured lines after the started cmd.exe has finished and closed itself. The captured output is a list of file names that is now loaded into the memory of the cmd.exe instance processing the batch file.

    It is important that the names of all CSV files are already loaded into the memory of the command process before any file is renamed. Otherwise, with a simple loop like for %%B in (*.csv) do, some CSV files might not be processed at all and others might be processed multiple times. The standard FOR loop asks the file system on each iteration for the next file name matching the wildcard pattern, which is problematic when renaming files in the same directory because the list of file names matched by *.csv changes after each execution of the command REN.

    The FOR /F options eol=| and delims= are used to change the default end of line character ; (a CSV file name could begin with a semicolon) to a vertical bar, which no file name can contain, and to change the list of delimiters from the default horizontal tab and normal space to an empty list. These two options make sure that no CSV file name is ignored and that the complete file name is assigned to the loop variable B, even for a name like ; sample file with leading space and semicolon.csv.
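
    The effect of the two options can be checked with a small test batch file. This is only a sketch: the file names are made-up strings that FOR /F parses directly, so no files need to exist.

    ```batch
    @echo off
    setlocal EnableExtensions DisableDelayedExpansion

    rem Default options: the name starts with the default end of line
    rem character, so the line is ignored and nothing is output.
    for /F %%B in ("; sample file.csv") do echo Default options:  [%%B]

    rem Only eol changed: the name is split at the first space.
    for /F "eol=|" %%B in ("my sample file.csv") do echo eol changed only: [%%B]

    rem eol changed and empty delimiter list: the complete name is assigned.
    for /F "eol=| delims=" %%B in ("; sample file.csv") do echo Both options:     [%%B]

    endlocal
    ```

    The first loop should print nothing, the second only [my], and the third the complete name [; sample file.csv].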

    The subroutine ReadData is called for each file name, with the file name passed enclosed in " so that file names containing a space or one of the characters &()[]{}^=;!'+,`~ are also processed correctly.

    The entire FOR loop is enclosed in a command block that begins with ( at the beginning of the line and ends with the matching ) after "%%B". This makes it possible to append the unconditional command operator & and goto EndBatch on the same command line, so that batch file processing continues below the label EndBatch after all CSV files have been renamed. That avoids a fall-through into the subroutine.

    Renaming hundreds or even thousands of CSV files by calling a subroutine is very inefficient. However, the code is optimized a little because cmd.exe always finds the label ReadData immediately on the next line in the batch file, which reduces the file system accesses for the renaming task.

    The subroutine ReadData uses one more FOR /F loop, which opens the CSV file, skips the header line at the top of the file, reads the second line, splits it into substrings using the semicolon as separator, and assigns the first, second and fourth semicolon-separated strings to the loop variables G, H and I.
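
    To see what this loop extracts without renaming anything, a dry run along the following lines can be used. It is a minimal sketch that assumes the sample.csv from the question is in the current directory; the label Done is just an illustrative name.

    ```batch
    @echo off
    setlocal EnableExtensions DisableDelayedExpansion

    rem Read only the second line of sample.csv and print the name that
    rem would be built from its first, second and fourth field.
    for /F "usebackq skip=1 tokens=1,2,4 delims=;" %%G in ("sample.csv") do echo New name would be: %%G%%H%%I.csv& goto Done

    :Done
    endlocal
    ```

    For the sample data shown in the question, this should print New name would be: ACC2023-08-18ABCDEFG-3.csv.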

    Note 1: The second line must never begin with a semicolon, i.e. have an empty value in the first data column.

    Note 2: The second line must also never have an empty second, third or fourth data value because FOR /F interprets ;; as one delimiter and not as two delimiters.

    In other words, the data row on the second line must always have four non-empty values in the first four data columns.
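
    The behavior described in Note 2 can be reproduced with a made-up data row that has an empty second field. The row below is a hypothetical string, not data from a real file.

    ```batch
    @echo off
    setlocal EnableExtensions DisableDelayedExpansion

    rem The two consecutive semicolons after ACC count as ONE delimiter,
    rem so every following field moves one position to the left.
    for /F "tokens=1,2,4 delims=;" %%G in ("ACC;;BBLUK;ABCDEFG-3;check24toprovide") do echo [%%G] [%%H] [%%I]

    endlocal
    ```

    This should print [ACC] [BBLUK] [check24toprovide], so the resulting file name would be built from the wrong columns.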

    The new file name, built from the first, second and fourth data values, is assigned to the environment variable NewFileName. The command goto RenameFile, appended on the same command line after the unconditional command operator &, exits the loop even if the CSV file has more than two lines. That also closes the CSV file opened by cmd.exe, which is important because the next command line renames the file, and that is not possible while any process still has it open, including cmd.exe processing the batch file.

    A label is not possible inside a command block beginning with ( and ending with the matching ). That is the reason for using a subroutine, in which goto RenameFile can be used to exit the loop and continue batch file processing with the line below the label RenameFile.

    Next, a case-insensitive string comparison checks whether the current CSV file already has the wanted file name. This condition makes it possible to run the batch file multiple times in the same directory without an error message being displayed when one or more CSV files in the current directory already have the wanted file name.

    Otherwise, that is, if the current file name is not equal to the new file name, the current CSV file is renamed. The rename can fail if the current CSV file is opened by another process or if the current directory already contains a file or folder with the new file name. An error message is output in this case.
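
    If a more descriptive message is wanted for the name collision case, the subroutine could check for an existing target before calling REN. The following variant of the batch file above is only a sketch of that idea. The message text is made up, and the other failure case (file opened by another process) is still reported by REN itself.

    ```batch
    @echo off
    setlocal EnableExtensions DisableDelayedExpansion
    (for /F "eol=| delims=" %%B in ('dir *.csv /A-D /B 2^>nul') do call :ReadData "%%B") & goto EndBatch
    :ReadData
    for /F "usebackq skip=1 tokens=1,2,4 delims=;" %%G in (%1) do set "NewFileName=%%G%%H%%I.csv" & goto RenameFile
    :RenameFile
    rem Nothing to do if the file already has the wanted name.
    if /I %1 == "%NewFileName%" goto :EOF
    rem Added check: skip the file if another file or folder has that name.
    if exist "%NewFileName%" (
        echo Skipped %1 because "%NewFileName%" already exists.
        goto :EOF
    )
    ren %1 "%NewFileName%"
    goto :EOF
    :EndBatch
    endlocal
    ```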

    The subroutine is exited with goto :EOF, which returns batch file processing to the first FOR /F loop, which then processes the next captured line, i.e. the next CSV file name.

    To understand the commands used and how they work, open a command prompt window, execute there the following commands, and read the displayed help pages for each command, entirely and carefully.

    • dir /?
    • echo /?
    • endlocal /?
    • for /?
    • goto /?
    • if /?
    • ren /?
    • setlocal /?

    See also: