batch-file cmd alphanumeric non-alphanumeric

Removing non-alphanumeric characters in a batch variable


In batch, how would I remove all non-alphanumeric characters (everything except a-z, A-Z, 0-9 and _) from a variable?

I'm pretty sure I need to use findstr and a regex.


Solution

  • The solution of MC ND works, but it's really slow (it needs ~1 second for the small test sample).

    This is caused by the echo "!_buf!"|findstr ... construct: for each character, the pipe creates two instances of cmd.exe and starts findstr.

    But this can also be solved in pure batch.
    Each character is tested for membership in the map variable: removing the character from map changes the string only if the character occurs in it.

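    @echo off
    rem The !var! syntax used below requires delayed expansion
    setlocal EnableDelayedExpansion
    :test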
    
        set "_input=Th""i\s&& is not good _maybe_???"
        set "_output="
        set "map=abcdefghijklmnopqrstuvwxyz 1234567890"
    
    :loop
    if not defined _input goto endLoop    
    for /F "delims=*~ eol=*" %%C in ("!_input:~0,1!") do (
        if "!map:%%C=!" NEQ "!map!" set "_output=!_output!%%C"
    )
    set "_input=!_input:~1!"
        goto loop
    
    :endLoop
        echo(!_output!
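
    The membership test is easier to see in isolation. The following is a minimal sketch, not part of the original answer; the short map and the sample characters are made up for illustration. It also shows that the search string in a !var:search=! substitution is matched case-insensitively, which is why a lowercase-only map lets uppercase letters through as well.

    ```batch
    @echo off
    setlocal EnableDelayedExpansion

    rem Illustrative map only; the real map lists every allowed character
    set "map=abc123"

    rem Removing %%C from map changes the string only when %%C occurs in map
    for %%C in (a z 1 A) do (
        if "!map:%%C=!" NEQ "!map!" (echo %%C is in the map) else (echo %%C is not in the map)
    )
    ```

    With this map, a, 1 and A are reported as being in the map and z is not.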
    

    It can be sped up further by removing the goto loop.
    To do that, calculate the string length first and then iterate over each character with a FOR /L loop.
    This solution is ~6 times faster than the method above and ~40 times faster than the solution of MC ND.

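    rem $strLen must already be defined, while delayed expansion is still disabled
    setlocal EnableDelayedExpansion
    set "_input=Th""i\s&& is not good _maybe_!~*???"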
    set "_output="
    set "map=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ 1234567890"
    %$strLen% len _input
    
    for /L %%n in (0 1 %len%) DO (
        for /F "delims=*~ eol=*" %%C in ("!_input:~%%n,1!") do (
            if "!map:%%C=!" NEQ "!map!" set "_output=!_output!%%C"
        )
    )
    echo(!_output!
    exit /b
    

    The macro $strLen can be defined with:

    set LF=^
    
    
    ::Above 2 blank lines are required - do not remove
    @set ^"\n=^^^%LF%%LF%^%LF%%LF%^^"
    :::: StrLen pResult pString
    set $strLen=for /L %%n in (1 1 2) do if %%n==2 (%\n%
            for /F "tokens=1,2 delims=, " %%1 in ("!argv!") do (%\n%
                set "str=A!%%~2!"%\n%
                  set "len=0"%\n%
                  for /l %%A in (12,-1,0) do (%\n%
                    set /a "len|=1<<%%A"%\n%
                    for %%B in (!len!) do if "!str:~%%B,1!"=="" set /a "len&=~1<<%%A"%\n%
                  )%\n%
                  for %%v in (!len!) do endlocal^&if "%%~b" neq "" (set "%%~1=%%v") else echo %%v%\n%
            ) %\n%
    ) ELSE setlocal enableDelayedExpansion ^& set argv=,
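
    The core of the macro is a binary search over the bits of the length. As a rough sketch, the same search can be written as a plain loop without the macro machinery; the variable name text and its sample value are assumptions made for illustration.

    ```batch
    @echo off
    setlocal EnableDelayedExpansion

    rem Hypothetical sample value
    set "text=Hello world"

    rem Prefix one character so that index N exists exactly when text has at least N characters
    set "str=A!text!"
    set "len=0"

    rem Try each bit from 12 down to 0; clear it again if that index is past the end
    for /L %%A in (12,-1,0) do (
        set /a "len|=1<<%%A"
        for %%B in (!len!) do if "!str:~%%B,1!"=="" set /a "len&=~1<<%%A"
    )

    rem Prints 11 for the sample value
    echo !len!
    ```

    Because only bits 12 to 0 are tested, this handles strings of up to 8191 characters using 13 substring probes instead of one per character.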