This is a spin-off from In Python, how do I split a string and keep the separators?
rawByteString = b'\\!\x00\x00\x00\x00\x00\x00\\!\x00\x00\x00\x00\x00\x00'
How can I split this rawByteString into parts using "\\!" as the delimiter without dropping the delimiters, so that I get:
[b'\\!\x00\x00\x00\x00\x00\x00', b'\\!\x00\x00\x00\x00\x00\x00']
I do not want to use [b'\\!' + x for x in rawByteString.split(b'\\!')][1:]
as that relies on bytes.split() and is just a workaround. That is why this question is tagged with the "re" module.
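You may use either of the following (Python 3.7+ is required, because earlier versions of re.split() do not split on empty matches):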
re.split(rb'(?!\A)(?=\\!)', rawByteString)
re.split(rb'(?!^)(?=\\!)', rawByteString)
See a sample regex demo (the string input changed since null bytes cannot be part of a string).
Regex details
(?!^) / (?!\A) / (?<!^) - a position other than the start of the string
(?=\\!) - a position immediately followed by a backslash + !
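To see what each lookaround contributes, here is a small sketch that compares the split with and without the (?!\A) guard. It assumes Python 3.7 or later, and the variable names are only illustrative.

```python
import re

data = b'\\!\x00\x00\x00\x00\x00\x00\\!\x00\x00\x00\x00\x00\x00'

# Lookahead only: the empty match at position 0 also causes a split,
# so the result starts with an empty bytes object.
with_leading_empty = re.split(rb'(?=\\!)', data)

# With (?!\A), the empty match at the very start of the input is rejected.
without_leading_empty = re.split(rb'(?!\A)(?=\\!)', data)

print(with_leading_empty)
print(without_leading_empty)
```

The only difference between the two results should be the leading b'' item, which is what the start-of-string check is there to suppress.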
NOTES
The b prefix is required when defining the pattern string literal.
The r prefix makes the string literal a raw string literal, so that we do not have to double-escape backslashes and can use \\ to match a single \ in the string.
See Python demo:
import re
rawByteString = b'\\!\x00\x00\x00\x00\x00\x00\\!\x00\x00\x00\x00\x00\x00'
# Split at every position followed by \! except the start of the input
print(re.split(rb'(?!\A)(?=\\!)', rawByteString))
Output:
[b'\\!\x00\x00\x00\x00\x00\x00', b'\\!\x00\x00\x00\x00\x00\x00']
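If the delimiter is not known in advance, the same pattern can be built at run time. The following is a minimal sketch, not part of the original answer: split_keep_delimiter is a hypothetical helper name, and it assumes Python 3.7+ so that re.split() honors empty matches. re.escape() accepts bytes, so the delimiter is matched literally.

```python
import re

def split_keep_delimiter(data: bytes, delimiter: bytes) -> list:
    """Split data before each occurrence of delimiter, keeping the delimiter."""
    # Escape the delimiter so regex metacharacters in it are matched literally.
    pattern = rb'(?!\A)(?=' + re.escape(delimiter) + rb')'
    return re.split(pattern, data)

raw = b'\\!\x00\x00\x00\x00\x00\x00\\!\x00\x00\x00\x00\x00\x00'
parts = split_keep_delimiter(raw, b'\\!')
print(parts)
```

Because the split points are zero-width, joining the parts with b''.join(parts) gives back the original input. If the input does not start with the delimiter, the first item is whatever precedes the first delimiter.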