I want to split a query string that may contain several different delimiters.
My string is:
book_id=123&start_date>=2023-09-12&end_date<=2023-09-14&status!=return
There are 6 delimiters: =, !=, >, <, >=, <=
I want to split each part of the string on whichever of these delimiters it contains.
First I split the string on '&', then I tried:
re.findall('[^=!=><>=<=\s]+|[=!=><>=<=]', 'sample!=2023-09-01')
and I got this: ['sample', '!', '=', '2023-09-01']
re.findall('[^=!=><>=<=\s]+|[=!=><>=<=]', 'sample=2023-09-01')
['sample', '=', '2023-09-01']
The results are inconsistent: a two-character operator is split into two separate items.
I want key, delim, val = ['sample', '!=', '2023-09-01'] instead of ['sample', '!', '=', '2023-09-01'].
My previous regex was re.split('[=|!=|>|<|>=|<=]', param)
and it gave me the same result.
I referred to: regex split by multiple delimiter
If the key-value pairs are separated with a & symbol, I suggest splitting the string on that symbol first, and then using re.split with a (!?=|[><]=?) pattern that matches and captures into Group 1 all six operators you listed:
[re.split(r'(!?=|[><]=?)', x) for x in text.split('&')]
See the regex demo. Note that re.split keeps all the captured substrings in its result. Pattern details:
!?= - an optional ! and then a =
| - or
[><]=? - a < or > followed by an optional =, i.e. <, >, <= or >=
See also a Python demo:
import re
text = "book_id=123&start_date>=2023-09-12&end_date<=2023-09-14&status!=return"
print( [re.split(r'(!?=|[><]=?)', x) for x in text.split('&')] )
Output:
[['book_id', '=', '123'], ['start_date', '>=', '2023-09-12'], ['end_date', '<=', '2023-09-14'], ['status', '!=', 'return']]
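To get the key, delim, val unpacking asked for in the question, the same pattern can be wrapped in a small helper. This is only a sketch: the names OPERATOR and parse_query are hypothetical, and it assumes every '&'-separated part contains exactly one operator of interest. Passing maxsplit=1 is an extra precaution not in the snippet above, so that a value which itself contains = or < is not split further.

```python
import re

# Same pattern as above: matches each of the six operators as one token.
OPERATOR = re.compile(r'(!?=|[><]=?)')

def parse_query(text):
    """Split 'k<op>v&k<op>v...' into (key, operator, value) tuples."""
    triples = []
    for part in text.split('&'):
        # maxsplit=1 stops after the first operator, so operator
        # characters inside the value are left intact.
        pieces = OPERATOR.split(part, maxsplit=1)
        if len(pieces) != 3:
            raise ValueError(f"no operator found in {part!r}")
        key, delim, val = pieces
        triples.append((key, delim, val))
    return triples

text = "book_id=123&start_date>=2023-09-12&end_date<=2023-09-14&status!=return"
for key, delim, val in parse_query(text):
    print(key, delim, val)
```

Raising ValueError for a part without an operator is one possible choice; skipping such parts or returning None for them would work equally well depending on how strict the parsing should be.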