I want to isolate a specific string from text held in a variable, using batch, but it doesn't work as intended. I may have the regexp wrong, or maybe I misunderstood the way "findstr" works.
The specific string that I need to isolate is a GUID (which has a standard format of alphanumeric characters, arranged in groups separated by a "-", like this: 8-4-4-4-12).
@echo off
setlocal enabledelayedexpansion
SET str="This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"
SET rx=[a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12}
FOR %%u IN ('FINDSTR /r "!rx!" "!str!"') DO ECHO %%u
endlocal
Basically, what I need is to store the GUID in a separate variable, so I can use it later on. If that can be achieved in a different manner, I'm happy to learn!
Thanks!
@ECHO Off
SETLOCAL
SET "str=This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"
:: Theoretical method: one long pattern (this version gave the out of memory error)
SET "hn=[a-f0-9]"
SET "hn4=%hn%%hn%%hn%%hn%"
SET "hn8=%hn4%%hn4%"
SET "wrx=%hn8%-%hn4%-%hn4%-%hn4%-%hn8%%hn4%"
:again
IF NOT DEFINED str ECHO notfound&GOTO done
ECHO %str%|FINDSTR /b /r /i "%wrx%">NUL
IF ERRORLEVEL 1 (
REM did not find string
SET "str=%str:~1%"
GOTO again
)
SET "str=%str:~0,36%"
ECHO found "%str%"
:done
:: BFI (brute force and ignorance) method: test the pattern in small chunks
SET "str=This is a string that has a long uuid: (UUID: 359f975d-2649-4e20-b7c0-b452aaaca4b2)"
SET "hn=[a-f0-9]"
SET "hn4=%hn%%hn%%hn%%hn%"
SET "hn8=%hn4%%hn4%"
:bfiagain
IF NOT DEFINED str ECHO notfound&GOTO donebfi
:: "regex" using brute-force and ignorance
ECHO %str:~0,9%|FINDSTR /b /i /r "%hn8%-">NUL
IF ERRORLEVEL 1 GOTO bfino
ECHO %str:~9,5%|FINDSTR /b /i /r "%hn4%-">NUL
IF ERRORLEVEL 1 GOTO bfino
ECHO %str:~14,10%|FINDSTR /b /i /r "%hn4%-%hn4%-">NUL
IF ERRORLEVEL 1 GOTO bfino
ECHO %str:~24,12%|FINDSTR /b /i /r "%hn4%%hn8%">NUL
:bfino
IF ERRORLEVEL 1 (
SET "str=%str:~1%"
GOTO bfiagain
)
SET "str=%str:~0,36%"
ECHO found "%str%"
:donebfi
GOTO :EOF
Well, not so squeezy...
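Fundamentally, findstr implements a very small subset of regex. It's intended to locate a character-string in a file, so the second quoted argument in the question's command is taken as a filename, not as text to search. Also, FOR without /F never runs the quoted command; it just iterates over the words inside the parentheses.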
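To make the "very small subset" point concrete, here is a minimal sketch (not part of the tested code above). It assumes a stock cmd.exe and uses the hypothetical variable names hex and hex4, which simply mirror hn and hn4. As far as I know, findstr has no {n} repeat count and treats the braces as literal text, so the first test should report a match and the second should not.

```batch
@ECHO OFF
SETLOCAL
:: hex and hex4 are illustrative names, not anything findstr requires.
:: No {n} quantifier in findstr, so the class is written out once per character.
SET "hex=[0-9a-f]"
SET "hex4=%hex%%hex%%hex%%hex%"

:: Test 1: four hex digits and a dash at the start of the line.
ECHO b7c0-|FINDSTR /b /i /r "%hex4%-">NUL
IF ERRORLEVEL 1 (ECHO test 1: no match) ELSE (ECHO test 1: match)

:: Test 2: the same idea written with {4}; findstr looks for the literal text {4}.
ECHO b7c0-|FINDSTR /b /i /r "%hex%{4}-">NUL
IF ERRORLEVEL 1 (ECHO test 2: no match) ELSE (ECHO test 2: match)
ENDLOCAL
```

The text to test is piped in with ECHO because findstr reads standard input when no filename is given.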
Theoretically, you could string [a-f0-9] together the requisite number of times and add in the - separators for use as the "regex", then see whether the subject string /b (begins) with such a pattern; lop off the start character if not and repeat until found or subject-string is empty.
Notes here: I believe GUID uses HEX digits only, not alphanumerics. findstr supports /i to have the comparison made case-insensitively (which shortens the individual "character-match" string). Yes - I know ^ can be used in a regex (even one from Uncle Bill's little programmers' toolset) but I prefer /b.
The only small problem with this is that it yielded an out of memory error...
So, feed it small chunks at a time, and it appears happy...
I've done no further testing, and predict stormy weather if your text-string contains characters which cmd regards as specials - the usual suspects like redirectors, % and rabbit's ears.