windows batch-file dos mhtml

Batch file to remove certain string from multiple files across multiple folders


I have a folder with many subfolders, each containing dozens of .mht files of different lengths. I need help creating a batch script that removes this string from every file:

"</BODY></HTML>"

I can then add lines to the file before appending the string back to the end. This is what I have so far:

setlocal enabledelayedexpansion

for /r %%a in (*.mht) do (
    for /f "skip=11" %%b in (%%a) do (if %%b=="</BODY></HTML>" del %%b)
    echo "stuff to add">>%%a
    echo "</BODY></HTML>">>%%a
)

Is there any way to fix my current script, or does anyone know an easier way to do this?

Note: I tried copying everything except the unwanted string to a temp file, but the first 11 lines contain special characters, e.g. | , . :


Solution

  • Update

    I have added a new script that replaces the search term with the desired code directly within the line. The script can handle special characters.


    Limitations

    1. Leading close bracket ] characters are trimmed from the beginning of lines. This is not an issue, since no line in HTML should begin with this character. (This can be fixed if needed.)
    2. The percent sign % character cannot be used in either the search term or replacement term.
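
    One possible way to lift limitation 1, as a minimal sketch rather than the author's method: read the whole numbered line with delims= and then strip the [n] prefix that find /n adds by using the *] substitution, which removes everything up to and including the first close bracket. The demo file name and the :Strip label are hypothetical. Quotation marks are left doubled here for brevity.

      @echo off
      setlocal EnableExtensions DisableDelayedExpansion

      rem Hypothetical demo file whose first line starts with close brackets.
      set "demo=%TEMP%\bracket_demo.txt"
      >"%demo%" echo ]] keeps its brackets
      >>"%demo%" echo second line

      rem delims= keeps the whole numbered line in one token.
      for /f "delims=" %%A in ('type "%demo%" ^| find /n /v ""') do (
          set "_=%%A"
          call :Strip
      )
      del "%demo%"
      endlocal
      goto :eof

      :Strip
      rem Double the quotation marks first so later expansions stay balanced.
      set "_=%_:"=""%"
      rem Drop everything up to and including the first close bracket, which ends the line number prefix.
      set "_=%_:*]=%"
      rem An undefined variable here means the original line was blank.
      if not defined _ echo(& goto :eof
      rem List the variable without re-parsing its value.
      set _
      goto :eof

    With this approach the first demo line should be listed as _=]] keeps its brackets, with its leading brackets intact.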

    Notes

    1. Lines cannot contain an odd number of double quotation marks ", so I double each one "" to ensure an even number. This means that if either of your strings contains quotation marks, they must be doubled as well!
    2. To use the script, just replace the search term and replacement term with what you want on the following line of code.

      set "_=%_:search term=replacement term%"
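
    To illustrate that substitution line in isolation, here is a small sketch with a hypothetical sample value and replacement. It relies on the fact that this kind of substitution matches the search term case-insensitively, so a lowercase </body> search term also matches </BODY>.

      @echo off
      setlocal EnableExtensions DisableDelayedExpansion

      rem Hypothetical sample line. The real script reads each line from the file.
      set "_=<p>Hello</p></BODY>"

      rem The lowercase search term also matches the uppercase closing tag.
      set "_=%_:</body>=<hr></body>%"

      rem List variables whose names start with an underscore. No escaping is needed for this.
      set _
      endlocal

    The listing should include _=<p>Hello</p><hr></body>. Note that the matched text is replaced by the replacement term exactly as typed, so the tag comes out in lowercase.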

    New Script.bat

    @echo off
    setlocal EnableExtensions DisableDelayedExpansion
    
    :: Limitations
    :: 1. Any leading close bracket ] characters will be trimmed due to delims=].
    
    for /r %%F in (html.txt) do if exist "%%~fF" (
        for /f "tokens=1,* delims=]" %%K in ('type "%%~fF" ^| find /n /v ""') do (
            set "_=%%L"
            call :Expand_
        )
    )
    goto End
    
    
    :Expand_
    :: NULL is a blank line or line with only a close bracket ].
    if not defined _ echo. & goto :eof
    :: Ensure even number of double quotation marks.
    set "_=%_:"=""%"
    :: Inject the code.
    set "_=%_:</body>=<code>To Inject</code></body>%"
    :: Escape batch special characters.
    set "_=%_:^=^^%"
    set "_=%_:<=^<%"
    set "_=%_:>=^>%"
    set "_=%_:&=^&%"
    set "_=%_:|=^|%"
    :: Revert quotations.
    set "_=%_:""="%"
    :: Display
    echo(%_%
    goto :eof
    
    
    :End
    endlocal
    pause >nul
    

    Original

    This should do what you want. No Delayed Expansion needed. Should support all special characters.

    Limitations

    1. Leading close bracket ] characters are trimmed. This is not an issue, since no line in HTML should begin with the close bracket character. (This can be fixed if needed.)
    2. The percent sign % character cannot be used in either the search term or replacement term.

    Notes

    1. Lines cannot contain an odd number of double quotation marks ", so I double each one "" to ensure an even number. This means that if the string to match contains quotation marks, they must be doubled as well. (This does not apply to your scenario.)
    2. Delayed Expansion must not be enabled around the line for /f %%S in ('echo "%xLine%"^| find /i "</body>"') do ( because exclamation marks ! in the file would otherwise cause problems.
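
    The effect behind note 2 can be seen with a short sketch. The sample text is hypothetical; the point is only that the same FOR variable loses its exclamation mark once Delayed Expansion is switched on.

      @echo off
      setlocal EnableExtensions DisableDelayedExpansion

      for %%L in ("Hello World!") do (
          rem Delayed Expansion is off, so the exclamation mark survives.
          echo Disabled: %%~L
          setlocal EnableDelayedExpansion
          rem Delayed Expansion is on, so the lone exclamation mark is dropped.
          echo Enabled:  %%~L
          endlocal
      )
      endlocal

    The first echo should print the text with its exclamation mark and the second without it, which is why the scripts here keep Delayed Expansion disabled while handling file content.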

    Script.bat

    @echo off
    setlocal EnableExtensions
    
    for /r %%F in (*.mht) do if exist "%%~fF" (
        rem Limitation - Any leading close bracket ] characters will be trimmed.
        for /f "tokens=1,* delims=]" %%K in ('type "%%~fF" ^| find /n /v ""') do (
            set "xLine="%%L""
            call :Match
            echo(%%L>>"%%~dpF\new_%%~nF%%~xF"
        )
        rem del "%%~fF"
        rem ren "%%~dpF\new_%%~nF%%~xF" "%%~nxF"
    )
    goto End
    
    
    :Match
    setlocal EnableExtensions DisableDelayedExpansion
    rem Double the double quotations to ensure even number of double quotations.
    set "xLine=%xLine:"=""%"
    for /f %%S in ('echo "%xLine%"^| find /i "</body>"') do (
        rem Add your code to inject here.  Copy the template echo below.
        rem Note that special characters need to be escaped.
        echo Inject Code>>"%%~dpF\new_%%~nF%%~xF"
    )
    endlocal
    goto :eof
    
    
    :End
    endlocal
    pause >nul
    

    This will write the new file to new_<filename>.mht in the same folder. If you want to replace the old file with the new file, just remove the rem command from before the del and ren commands.
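
    As an alternative to enabling del and ren inside the loop, a separate pass could swap the files in after the output has been checked. This is a minimal sketch that assumes the new_ prefix naming used by Script.bat. It overwrites the originals, so try it on a copy of the folder first.

      @echo off
      setlocal EnableExtensions DisableDelayedExpansion

      rem For every .mht file that has a new_ counterpart in the same folder,
      rem overwrite the original with that counterpart.
      for /r %%F in (*.mht) do if exist "%%~dpF\new_%%~nxF" (
          move /y "%%~dpF\new_%%~nxF" "%%~fF" >nul
      )
      endlocal

    The backslash after %%~dpF is kept on purpose, as in Script.bat: it separates the variable from the literal new_ text, and Windows tolerates the resulting doubled backslash in the path.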