batch-file cmd findstr

How to use CMD regex to limit output in recursive directory search


I need to create a dynamic report for a set of folders in a code repository that are inconsistently stored; some have the key feature two levels down, some three, and a few are four+ levels down.

  • The closest thing to consistency is that all of the key folders I need to report on contain either "[...]d_d[...]" or "[...]dd_dd[...]" (where d is a numeric digit); these matched strings may occur in the middle or at the end, but will never occur at the beginning.
  • Once found, I need to list only the key folder and not any of its subfolders, even if one or more of its subfolders also happen to contain either of the same patterns.
  • Because of the way the recursive listing is generated, all results will have a terminating slash, but I do not want anything beyond that slash; I honestly don't care whether the terminating slash is part of the report.
  • Because of the inconsistencies, there may be three, four, five, or possibly more slashes in the valid output.

Here is a short example: I need the lines prefixed with +++ in the results, and the lines prefixed with --- excluded.

+++ A1B2C3/ABC_Core/ABC_HIJ_R71_00_00/
--- A1B2C3/ABC_Core/ABC_HIJ_R71_00_00/QR-HIJ-Outbound-123-Svc/SharedResources/WXYZ/Client/TU4_987_864X22Dat/
+++ A1B2C3/ABC_Core/ABC_HIJ_R72_00_00/
--- A1B2C3/ABC_Core/ABC_HIJ_R72_00_00/QR-HIJ-Outbound-123-Svc/SharedResources/WXYX/Client/TU4_987_864X22Dat/
+++ A1B2C3/ABC_Core/ABC_HIJ_R73_00_00_WidgetMod/
--- A1B2C3/ABC_Core/ABC_HIJ_R73_00_00_WidgetMod/QR-HIJ-Outbound-123-Svc/
+++ D4E5F6/QRWidgetFlow_R_1_0_0_DMND0903212-ErrorReports/

I've tried several variants of this:

findstr /e /r /c:"[0-9][0-9]_[0-9][0-9][^0-9/]*/" /c:"[0-9]_[0-9][^0-9/]*/"

but each time I change it around, I either gain extra subfolders or lose key folders I had before.

Any help would be greatly appreciated.


Solution

    @ECHO Off
    SETLOCAL ENABLEDELAYEDEXPANSION 
    
    rem The following setting for the directory is a name
    rem that I use for testing and deliberately includes spaces to make sure
    rem that the process works using such names. These will need to be changed to suit your situation.
    
    SET "sourcedir=u:\your files"
    SET "tempfile=%tmp%\afilename"
    SET "numerics=0-9"
    SET "lastkey=?"
    
    (FOR /d /r "%sourcedir%" %%e IN (*_*) DO ECHO %%e)>"%tempfile%"
    FOR /f "delims=" %%e IN ('sort "%tempfile%"') DO (
     FOR %%y IN ("%%e\.") do (
      rem First test: FINDSTR /b sets errorlevel 1 when the leaf name does not start with 9_9 or 99_99
      ECHO %%~nxy|FINDSTR /b /r /c:"[%numerics%]_[%numerics%]" /c:"[%numerics%][%numerics%]_[%numerics%][%numerics%]">NUL
      IF ERRORLEVEL 1 (
       rem does not start 9_9 or 99_99 - does it contain 9_9 or 99_99 ?
       ECHO %%~nxy|FINDSTR /r /c:"[%numerics%]_[%numerics%]" /c:"[%numerics%][%numerics%]_[%numerics%][%numerics%]">NUL
       IF NOT ERRORLEVEL 1 (
        rem contains 9_9 or 99_99
        CALL :report "%%e"
       )
      )
     )
    )
    
    DEL "%tempfile%"
    
    GOTO :EOF
    
    :report
    SET "reportme=%~1"
    SET "reportme=!reportme:%lastkey%=!"
    IF "%reportme%" neq %1 GOTO :eof
    ECHO %~1
    SET "lastkey=%~1"
    GOTO :eof
    

    Always verify against a test directory before applying to real data.

    Obtain a full subdirectory list and store it in a tempfile.

    Read each directoryname from a sorted version of the tempfile, and derive the leafname. If the leafname does not start with the target strings, but does contain one of the strings, then it's a candidate to be reported.

    The :report routine checks whether the quoted directoryname passed as %1 contains the last-reported name. If it does, the name is ignored; otherwise it is reported and becomes the last-reported name.

    Since the names are sorted, all subdirectories of a "key" directory will follow that directory in the list.
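
    As a minimal sketch of that suppression step in isolation, the snippet below feeds the same :report routine three names that are already in sorted order. The paths are hypothetical stand-ins, not taken from a real tree; the first and third should be echoed, and the second should be skipped because it contains the first.

    ```batch
    @ECHO OFF
    SETLOCAL ENABLEDELAYEDEXPANSION
    rem No name has been reported yet, so use a placeholder that cannot occur in a path
    SET "lastkey=?"

    rem Hypothetical, already-sorted candidate names, passed quoted just as in the main script
    FOR %%p IN ("A\Key_1_0" "A\Key_1_0\Sub_2_3" "B\Other_12_34") DO CALL :report %%p
    GOTO :EOF

    :report
    SET "reportme=%~1"
    rem Remove the last-reported name from the candidate, if it is present
    SET "reportme=!reportme:%lastkey%=!"
    rem If anything was removed, the candidate contains the last-reported name, so skip it
    IF "%reportme%" neq %1 GOTO :eof
    ECHO %~1
    SET "lastkey=%~1"
    GOTO :eof
    ```

    Because this is a plain substring test, a later name that merely begins with the last-reported path (say A\Key_1_0_Extra following A\Key_1_0) would be skipped as well, so it is worth checking the output for siblings named like that.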

    I believe that the strings to match may actually need to be _9_9 or _99_99, as 9_9 on its own would match the 7_8 in the last folder of "A1B2C3/ABC_Core/ABC_HIJ_R7xxxxx/QR-HIJ-Outbound-123-Svc/SharedResources/WXYZ/Client/TU4_987_864X22Dat/".
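
    To try the tighter patterns before changing the main script, a small check along these lines may help. It is only a sketch: the three leaf names are copied from the question's sample paths, and the leading underscore in each pattern is the assumption described above, not something the question confirms.

    ```batch
    @ECHO OFF
    rem Leaf names copied from the question's sample paths
    FOR %%n IN (ABC_HIJ_R71_00_00 TU4_987_864X22Dat QRWidgetFlow_R_1_0_0_DMND0903212-ErrorReports) DO (
     rem Assumed tighter patterns: an underscore must come before the first digit group
     ECHO %%n|FINDSTR /r /c:"_[0-9]_[0-9]" /c:"_[0-9][0-9]_[0-9][0-9]">NUL
     rem FINDSTR leaves errorlevel 0 when at least one pattern matched
     IF NOT ERRORLEVEL 1 ECHO key candidate: %%n
    )
    ```

    With these patterns, ABC_HIJ_R71_00_00 and QRWidgetFlow_R_1_0_0_DMND0903212-ErrorReports should be listed, while TU4_987_864X22Dat should not, since neither 987_864 nor any part of it is a one-digit or two-digit pair preceded by an underscore.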