I am trying to use findstr to delete lines that match the search strings found in another file. This is what I have been trying, but it does not seem to work:
dir %ProjectDir%TypeScript\*.ts /b /s > Files.txt
findstr /v /i /g:%ProjectDir%TypeScript\strictFiles.txt Files.txt > tsFiles.txt
Edit: This also does not seem to work:
dir %ProjectDir%TypeScript\*.ts /b /s | findstr /v /i /g:%ProjectDir%TypeScript\strictFiles.txt > tsFiles.txt
The short and incomplete answer is that you did not specify the /L switch of findstr, which forces literal searches. Without it, the first search string determines whether literal or regular expression mode is chosen. Your search strings are file names, which contain a period to separate the base name from the extension, and the period is also a meta-character in regular expression mode, so findstr most probably selects that mode.
In addition, you should provide the /X switch, which requires a search string to match the whole line, so that wrong items are not filtered out. For example, when the /X option is missing, the search string D:\Data\some would also match the line D:\Data\some\file.ext.
The long and comprehensive answer is that findstr does not make life that easy.
Let us assume the command line...:
dir /S /B /A:-D "D:\Project\TypeScript\*.ts" > "Files.txt"
...produces a list of file paths in Files.txt like this...:
D:\Project\TypeScript\sample.ts
D:\Project\TypeScript\restricted.ts
D:\Project\TypeScript\excluded.ts
D:\Project\TypeScript\not-excluded.ts
D:\Project\TypeScript\ancillary.ts
D:\Project\TypeScript\[special].ts
D:\Project\TypeScript\data\test.ts
D:\Project\TypeScript\data\confidential.ts
D:\Project\TypeScript\data\arbitrary.ts
D:\Project\TypeScript\data\.config.ts
D:\Project\TypeScript\data\other.config.ts
D:\Project\TypeScript\data.config.ts
D:\Project\TypeScript\conf.ts\wrong.ts
...and the file strictFiles.txt contains this...:
D:\Project\TypeScript\restricted.ts
D:\Project\TypeScript\excluded.ts
D:\Project\TypeScript\[special].ts
D:\Project\TypeScript\confidential.ts
D:\Project\TypeScript\data\.config.ts
D:\Project\TypeScript\conf.ts
...to be filtered out from Files.txt.
You would expect the command line...:
findstr /L /X /I /V /G:"strictFiles.txt" "Files.txt" > "tsFiles.txt"
...to return this in the output file tsFiles.txt...:
D:\Project\TypeScript\sample.ts
D:\Project\TypeScript\not-excluded.ts
D:\Project\TypeScript\ancillary.ts
D:\Project\TypeScript\data\test.ts
D:\Project\TypeScript\data\confidential.ts
D:\Project\TypeScript\data\arbitrary.ts
D:\Project\TypeScript\data\other.config.ts
D:\Project\TypeScript\data.config.ts
D:\Project\TypeScript\conf.ts\wrong.ts
...but it actually writes:
D:\Project\TypeScript\sample.ts
D:\Project\TypeScript\not-excluded.ts
D:\Project\TypeScript\ancillary.ts
D:\Project\TypeScript\[special].ts
D:\Project\TypeScript\data\test.ts
D:\Project\TypeScript\data\confidential.ts
D:\Project\TypeScript\data\arbitrary.ts
D:\Project\TypeScript\data\.config.ts
D:\Project\TypeScript\data\other.config.ts
D:\Project\TypeScript\conf.ts\wrong.ts
The reason for this is that findstr, although in literal search mode due to the /L option, still detects the meta-characters of regular expression mode and allows them to be escaped by a preceding \. The period . and the opening bracket [ in the above sample content of strictFiles.txt are such meta-characters, and both are preceded by the path separator \, so they are considered escaped and are interpreted as plain . and [. In other words, the preceding \ is dropped. The search string D:\Project\TypeScript\[special].ts therefore turns into D:\Project\TypeScript[special].ts, which matches nothing, and D:\Project\TypeScript\data\.config.ts turns into D:\Project\TypeScript\data.config.ts, which wrongly excludes that other file.
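To cross-check which lines ought to survive, independent of findstr and its escape handling, the intended rule (drop a line only when it equals an exclusion entry as a whole, ignoring case) can be sketched in a few lines of Python. This is not part of the batch solution; the function name filter_exact is made up for illustration, and the data is copied from the sample lists above.

```python
# Sample data copied from the Files.txt and strictFiles.txt listings above.
files = [
    r"D:\Project\TypeScript\sample.ts",
    r"D:\Project\TypeScript\restricted.ts",
    r"D:\Project\TypeScript\excluded.ts",
    r"D:\Project\TypeScript\not-excluded.ts",
    r"D:\Project\TypeScript\ancillary.ts",
    r"D:\Project\TypeScript\[special].ts",
    r"D:\Project\TypeScript\data\test.ts",
    r"D:\Project\TypeScript\data\confidential.ts",
    r"D:\Project\TypeScript\data\arbitrary.ts",
    r"D:\Project\TypeScript\data\.config.ts",
    r"D:\Project\TypeScript\data\other.config.ts",
    r"D:\Project\TypeScript\data.config.ts",
    r"D:\Project\TypeScript\conf.ts\wrong.ts",
]
excluded = [
    r"D:\Project\TypeScript\restricted.ts",
    r"D:\Project\TypeScript\excluded.ts",
    r"D:\Project\TypeScript\[special].ts",
    r"D:\Project\TypeScript\confidential.ts",
    r"D:\Project\TypeScript\data\.config.ts",
    r"D:\Project\TypeScript\conf.ts",
]


def filter_exact(paths, exclusions):
    """Keep the paths that equal no exclusion entry, ignoring case."""
    blocked = {entry.casefold() for entry in exclusions}
    return [path for path in paths if path.casefold() not in blocked]


kept = filter_exact(files, excluded)
for path in kept:
    print(path)
```

The nine lines it prints are the expected content of tsFiles.txt, which makes it easy to spot entries that findstr kept or dropped by mistake.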
To work around that, you need to escape every \ in strictFiles.txt by preceding it with another \, so that no meta-character appears escaped to findstr. See this script for a possible way:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_ROOT=D:\Project\TypeScript" & rem // (path of root directory)
set "_MASK=*.ts" & rem // (file search pattern)
set "_LIST=.\Files.txt" & rem // (path to file list)
set "_EXCL=.\strictFiles.txt" & rem // (path to exclusion list)
set "_TEMP=%TEMP%\%~n0_%RANDOM%.tmp" & rem // (temporary exclusion list)
set "_FILT=.\tsFiles.txt" & rem // (path to filtered file list)
if not defined _FILT set "_FILT=con"
rem // Generate list of files:
dir /S /B /A:-D "%_ROOT%\%_MASK%" > "%_LIST%"
rem // Modify exclusion list:
rem /* replace every path separator `\` by an escaped one `\\`,
rem so no other characters can appear escaped to `findstr`: */
> "%_TEMP%" (
    for /F "usebackq delims= eol=|" %%F in ("%_EXCL%") do (
        set "FILE=%%F"
        setlocal EnableDelayedExpansion
        echo(!FILE:\=\\!
        endlocal
    )
)
rem // Filter out files that occur in modified exclusion list:
findstr /L /X /V /I /G:"%_TEMP%" "%_LIST%" > "%_FILT%"
rem // Clean up temporary files:
del "%_LIST%" "%_TEMP%"
endlocal
exit /B
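The transformation the script applies to each line of the exclusion list is small enough to show in isolation. The following Python sketch only mirrors the backslash doubling so that the effect on one sample entry is visible; the function name double_backslashes is hypothetical, and the batch script above remains the actual workaround.

```python
def double_backslashes(line: str) -> str:
    # Replace every single backslash with two, so that no character
    # following a path separator looks escaped to findstr.
    return line.replace("\\", "\\\\")


print(double_backslashes(r"D:\Project\TypeScript\[special].ts"))
# prints D:\\Project\\TypeScript\\[special].ts
```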
If your exclusion list, say strictFileNames.txt this time, holds pure file names rather than full file paths, like for example...:
restricted.ts
excluded.ts
[special].ts
confidential.ts
.config.ts
conf.ts
...the approach is slightly different, because only the last path element of each entry in the file list Files.txt is to be taken into account. To achieve this, you need to precede every file name in the exclusion list with a path separator, again an escaped one like \\ for the aforementioned reason, in order to avoid wrong matches. For instance, file.ext would match both D:\Data\file.ext and D:\Data\X-file.ext, but \file.ext would match only the former, given that the /X option is replaced by /E (match at the end of a line) this time.
Here is a script which accomplishes that:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_ROOT=D:\Project\TypeScript" & rem // (path of root directory)
set "_MASK=*.ts" & rem // (file search pattern)
set "_LIST=.\Files.txt" & rem // (path to file list)
set "_EXCL=.\strictFileNames.txt" & rem // (path to exclusion list)
set "_TEMP=%TEMP%\%~n0_%RANDOM%.tmp" & rem // (temporary exclusion list)
set "_FILT=.\tsFiles.txt" & rem // (path to filtered file list)
if not defined _FILT set "_FILT=con"
rem // Generate list of files:
dir /S /B /A:-D "%_ROOT%\%_MASK%" > "%_LIST%"
rem // Modify exclusion list:
rem /* precede every file with an escaped path separator `\\`,
rem so no other characters can appear escaped to `findstr`: */
> "%_TEMP%" (
    for /F "usebackq delims= eol=|" %%F in ("%_EXCL%") do (
        echo(\\%%F
    )
)
rem // Filter out files that occur in modified exclusion list:
findstr /L /E /V /I /G:"%_TEMP%" "%_LIST%" > "%_FILT%"
rem // Clean up temporary files:
del "%_LIST%" "%_TEMP%"
endlocal
exit /B
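The matching rule behind this second script (a line is dropped when it ends with a path separator followed by one of the listed names, ignoring case) can again be cross-checked outside of findstr. The Python sketch below is only an illustration of that rule; filter_by_name is a made-up name, and the two sample paths are the ones used in the explanation above.

```python
def filter_by_name(paths, names):
    """Keep the paths whose last element is none of the given names."""
    # Prefix each name with a separator so that "file.ext" cannot
    # match the tail of "X-file.ext".
    suffixes = tuple("\\" + name.casefold() for name in names)
    return [path for path in paths if not path.casefold().endswith(suffixes)]


paths = [r"D:\Data\file.ext", r"D:\Data\X-file.ext"]
print(filter_by_name(paths, ["file.ext"]))
```

Only D:\Data\X-file.ext remains in the result, because the separator in front of the name prevents the partial match.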
All of the above sample file contents are chosen so that you can easily play around with them and see the differences when using the options /X or /E and when doubling the path separators \ or not.