I have about 300 000 files in a directory. They are numbered sequentially: x000001, x000002, ..., x300000. Some of these files are missing, and I need to write an output text file containing the numbers of the missing files. The following code handles only up to 10 000 files:
@echo off
setlocal enabledelayedexpansion
set "log=%cd%\logfile.txt"
for /f "delims=" %%a in ('dir /ad /b /s') do (
pushd "%%a"
for /L %%b in (10000,1,19999) do (
set a=%%b
set a=!a:~-4!
if not exist "*!a!.csv" >>"%log%" echo "%%a - *!a!.csv"
)
popd
)
How can I extend it to 300 000 files?
If all 300000 CSV files are in the current directory when the batch file is executed, this batch code does the job.
@echo off
set "log=%cd%\logfile.txt"
del "%log%" 2>nul
for /L %%N in (1,1,9) do if not exist *00000%%N.csv echo %%N - *00000%%N.csv>>"%log%"
for /L %%N in (10,1,99) do if not exist *0000%%N.csv echo %%N - *0000%%N.csv>>"%log%"
for /L %%N in (100,1,999) do if not exist *000%%N.csv echo %%N - *000%%N.csv>>"%log%"
for /L %%N in (1000,1,9999) do if not exist *00%%N.csv echo %%N - *00%%N.csv>>"%log%"
for /L %%N in (10000,1,99999) do if not exist *0%%N.csv echo %%N - *0%%N.csv>>"%log%"
for /L %%N in (100000,1,300000) do if not exist *%%N.csv echo %%N - *%%N.csv>>"%log%"
set "log="
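The six loops above differ only in the number of leading zeros. As a minimal sketch, a single loop could instead pad the number to six digits using delayed expansion. It assumes that every file name is exactly x followed by six digits and .csv (the loops above accept any prefix via the * wildcard); the variable name Num is only illustrative.

```bat
@echo off
setlocal EnableExtensions EnableDelayedExpansion
set "log=%cd%\logfile.txt"
del "%log%" 2>nul
for /L %%N in (1,1,300000) do (
    rem Prepend five zeros, then keep only the last six characters.
    set "Num=00000%%N"
    set "Num=!Num:~-6!"
    rem Assumption: the file name prefix is always the single letter x.
    if not exist "x!Num!.csv" echo x!Num!.csv>>"%log%"
)
endlocal
```

This variant still performs one file system lookup per number, so it is shorter to write but should not be expected to run much faster than the six loops.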
The second solution below is definitely much faster than the first because it processes the list of file names in the current directory once, from the first file name to the last. If the last file is not x300000.csv, the batch code writes one more line into the log file stating from which number up to the expected end number 300000 the files are missing in the current directory.
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem Delete log file before running file check.
set "log=%cd%\logfile.txt"
del "%log%" 2>nul
rem Define initial value for the number in the file names.
set "Number=0"
rem Define the file extension of the files.
set "Ext=.csv"
rem Define beginning of first file name with number 1.
set "Name=x00000"
rem Define position of dot separating name from extension.
set "DotPos=7"
rem Process the list of files matching the fixed-length pattern in the current
rem directory line by line, sorted by file name. Each file name is compared
rem case-sensitively with the file name expected for the current number.
rem A subroutine is called if the current file name is not the expected one.
for /F "delims=" %%F in ('dir /B /ON x??????%Ext% 2^>nul') do (
set /A Number+=1
if "!Name!!Number!%Ext%" NEQ "%%F" call :CheckDiff "%%F"
)
rem If the last file does not have the expected number 300000, log the
rem numbers of the remaining missing files with a single line.
if "%Number%" NEQ "300000" (
set /A Number+=1
echo All files from number !Number! to 300000 are also missing.>>"%log%"
)
endlocal
rem Exit this batch file by jumping to the predefined label EOF (End Of File).
goto :EOF
rem This is a subroutine called from the main loop whenever the current file
rem name does not match the expected file name. With file names in the
rem expected format, there are two possible reasons:
rem 1. One leading zero must be removed from variable "Name" because the
rem number has reached the next higher power of 10, i.e. from 1-9 to 10,
rem from 10-99 to 100, etc.
rem 2. The next file name really has a different number than expected,
rem which means that one or more files are missing in the list.
rem The first reason is checked by testing whether the dot separating name
rem and extension is at the correct position. If it is not, one zero is
rem removed from the end of the string in variable "Name" and the new
rem expected file name is compared with the current file name.
rem If the (possibly newly determined) expected file name is still not
rem equal to the current file name, the expected file name is written
rem into the log file because this file is missing in the list.
rem More files can be missing up to the current file name. Therefore the
rem number is increased and the entire subroutine is executed again as
rem long as the expected file name is not equal to the current file name.
rem The subroutine is exited with goto :EOF once the expected file name
rem is equal to the current file name, which continues the main loop
rem above with checking the next file name from the directory listing.
:CheckDiff
set "Expected=%Name%%Number%%Ext%"
if "!Expected:~%DotPos%,1!" NEQ "." (
set "Name=%Name:~0,-1%"
set "Expected=!Name!%Number%%Ext%"
)
if "%Expected%" EQU %1 goto :EOF
echo %Expected%>>"%log%"
set /A Number+=1
goto CheckDiff
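The leading-zero check in CheckDiff can be tried in isolation. The following sketch applies the same substring test to two sample names chosen only for illustration: a name with the dot at offset 7 and a name where the number has grown by one digit without a zero being removed from Name.

```bat
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem Same dot position as in the solution: x plus six digits.
set "DotPos=7"
rem The two names are illustrative samples, not files on disk.
for %%E in (x000009.csv x0000010.csv) do (
    set "Expected=%%E"
    rem Take one character at offset DotPos and compare it with a dot.
    if "!Expected:~%DotPos%,1!" NEQ "." (
        echo %%E - Name has one zero too many
    ) else (
        echo %%E - dot is at the expected position
    )
)
endlocal
```

For the first name the character at offset 7 is the dot, so the test passes; for the second it is a digit, which is the situation in which CheckDiff removes one zero from Name.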
To understand the commands used in both solutions and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.
call /?
del /?
dir /?
echo /?
endlocal /?
for /?
if /?
goto /?
rem /?
set /?
setlocal /?