regex batch-file batch-processing findstr

Trying to extract a GUID from a text, using batch (findstr + regexp)

I want to isolate a specific string from a text provided in a variable, using batch, but it doesn't seem to work as intended. I may do the regexp wrong, or maybe I misunderstood the way "findstr" works.

The specific string that I need to isolate is a GUID (which has a standard format of alphanumeric characters, arranged in groups of characters separated by a "-", like this: 8-4-4-4-12).

@echo off
setlocal enabledelayedexpansion

SET str="This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"
SET rx=[a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12}

 FOR %%u IN ('FINDSTR /r "!rx!" "!str!"') DO ECHO %%u

endlocal

Basically, what I need is to store the GUID in a separate variable, so I can use it later on. If that can be achieved in a different manner, I'm happy to learn!

Thanks!

Solution

@ECHO Off
SETLOCAL
SET "str=This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"

:: Theoretical

SET "hn=[a-f0-9]"
SET "hn4=%hn%%hn%%hn%%hn%"
SET "hn8=%hn4%%hn4%"
SET "wrx=%hn8%-%hn4%-%hn4%-%hn4%-%hn8%%hn4%"
:again
IF NOT DEFINED str ECHO notfound&GOTO done
ECHO %str%|FINDSTR /b /r /i "%wrx%">NUL
IF ERRORLEVEL 1 (
 REM did not find string
 SET "str=%str:~1%"
 GOTO again
)
SET "str=%str:~0,36%"
ECHO found "%str%"

:done

:: BFI method

SET "str=This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"
SET "hn=[a-f0-9]"
SET "hn4=%hn%%hn%%hn%%hn%"
SET "hn8=%hn4%%hn4%"

:bfiagain
IF NOT DEFINED str ECHO notfound&GOTO donebfi
:: "regex" using brute-force and ignorance
ECHO %str:~0,9%|FINDSTR /b /i /r  "%hn8%-">NUL
IF ERRORLEVEL 1 GOTO bfino
ECHO %str:~9,5%|FINDSTR /b /i /r  "%hn4%-">NUL
IF ERRORLEVEL 1 GOTO bfino
ECHO %str:~14,10%|FINDSTR /b /i /r  "%hn4%-%hn4%-">NUL
IF ERRORLEVEL 1 GOTO bfino
ECHO %str:~24,12%|FINDSTR /b /i /r  "%hn4%%hn8%">NUL
:bfino
IF ERRORLEVEL 1 (
 SET "str=%str:~1%"
 GOTO bfiagain
)
SET "str=%str:~0,36%"
ECHO found "%str%"

:donebfi

GOTO :EOF

Well, not so squeezy...

Fundamentally, findstr implements a very small subset of regex; notably, it has no {n} repetition counts, so the pattern in the question can't work as written. It's intended to locate a character-string in a file.
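To make the "small subset" point concrete, here is a minimal sketch of how findstr treats a brace count compared with a repeated character class. The variable name d is just for illustration and the eight-digit test string is made up. Note also that findstr takes its last argument as a file name, and FOR without /F does not run a quoted command, which are two more reasons the loop in the question prints nothing useful.

```batch
@ECHO OFF
SETLOCAL
REM findstr does not understand {8}; the braces are taken as literal characters,
REM so this looks for a digit followed by the text {8} and finds nothing.
ECHO 12345678|FINDSTR /r "[0-9]{8}">NUL
IF ERRORLEVEL 1 ECHO brace form: no match

REM Repeating the class eight times is the findstr way of saying eight digits.
SET "d=[0-9]"
ECHO 12345678|FINDSTR /r "%d%%d%%d%%d%%d%%d%%d%%d%">NUL
IF NOT ERRORLEVEL 1 ECHO repeated-class form: match
```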

Theoretically, you could string [a-f0-9] together the requisite number of times and add in the - separators for use as the "regex", then see whether the subject string /b (begins) with such a pattern; lop off the start character if not and repeat until found or subject-string is empty.

Notes here: I believe GUID uses HEX digits only, not alphanumerics. findstr supports /i to have the comparison made case-insensitively (which shortens the individual "character-match" string). Yes - I know ^ can be used in a regex (even one from Uncle Bill's little programmers' toolset) but I prefer /b.
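A small sketch of the two switches mentioned here, using a made-up three-letter test string rather than a GUID: /b and a leading caret in the pattern both anchor the match at the start of the line, and /i makes the comparison case-insensitive.

```batch
@ECHO OFF
SETLOCAL
REM /b anchors the match at the start of the line; /i ignores case.
ECHO abc|FINDSTR /b /i /r "AB">NUL
IF NOT ERRORLEVEL 1 ECHO b-switch form: match

REM The same test written with the caret anchor inside the pattern.
ECHO abc|FINDSTR /i /r "^AB">NUL
IF NOT ERRORLEVEL 1 ECHO caret form: match

REM Neither form matches when the pattern only occurs further into the text.
ECHO xabc|FINDSTR /b /i /r "AB">NUL
IF ERRORLEVEL 1 ECHO b-switch form on xabc: no match
```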

The only small problem with this is that it yielded an out of memory error...

So, feed it small chunks at a time (the BFI method in the code above), and it appears happy...

I've done no further testing, and predict stormy weather if your text-string contains characters which cmd regards as specials - the usual suspects like redirectors, % and rabbit's ears (double quotes).
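Since the question asks for the GUID in a separate variable, here is a sketch of the same chunk-at-a-time check run against a scratch copy, so that str itself is left alone. The names work and guid and the labels scan, next and report are my own, not part of the code above, and the same caveat about special characters applies.

```batch
@ECHO OFF
SETLOCAL
SET "str=This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"
SET "hn=[a-f0-9]"
SET "hn4=%hn%%hn%%hn%%hn%"
SET "hn8=%hn4%%hn4%"

REM work is a scratch copy that gets shortened; guid stays empty until a match is found
SET "work=%str%"
SET "guid="

:scan
IF NOT DEFINED work GOTO report
REM test the 8-4-4-4-12 layout in four small pieces, bailing out at the first failure
ECHO %work:~0,9%|FINDSTR /b /i /r "%hn8%-">NUL
IF ERRORLEVEL 1 GOTO next
ECHO %work:~9,5%|FINDSTR /b /i /r "%hn4%-">NUL
IF ERRORLEVEL 1 GOTO next
ECHO %work:~14,10%|FINDSTR /b /i /r "%hn4%-%hn4%-">NUL
IF ERRORLEVEL 1 GOTO next
ECHO %work:~24,12%|FINDSTR /b /i /r "%hn4%%hn8%">NUL
IF ERRORLEVEL 1 GOTO next
REM all four pieces matched, so the first 36 characters of work are the GUID
SET "guid=%work:~0,36%"
GOTO report

:next
REM lop off the first character and try again
SET "work=%work:~1%"
GOTO scan

:report
IF DEFINED guid (ECHO guid is "%guid%") ELSE ECHO notfound
GOTO :EOF
```

After the loop, guid can be used like any other variable, for example in a later IF or as an argument to another command, while str still holds the original text.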