I have asked a similar question before, but I want to modify it a bit to account for a new, specific use case.
I have a string such as SIT,UAT; call it a1,a2, where a1 and a2 can be any sequence of characters, separated by a comma (,). There can also be any number of unique elements, such as a3 and a4. These elements (a1, a2, up to aN) will only ever occur once in each a1,a2 combination.
I need a Python regex that will let me check whether only SIT and UAT exist in a particular string, separated by a comma, when there is more than one element in the input list.
Scenarios:
Input 1: SIT,UAT
SIT,UAT - should match with regex
UAT,SIT - should match with regex
SIT - should fail, as SIT and UAT are not both present together
UAT - should fail, as SIT and UAT are not both present together
TRA,SIT,UAT - should fail, as only SIT and UAT must be present together with no other elements; TRA was not provided in the input list
Thanks in advance!
The regular expression you probably want to use here is:
^(?:SIT,UAT|UAT,SIT)$
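As a quick check, the pattern can be run against the scenarios from the question using only the standard `re` module. This is a minimal sketch: the helper name `build_pattern` is made up for illustration, and `re.escape` is added on the assumption that the values might contain regex metacharacters.

```python
import re

def build_pattern(env1, env2):
    # Hypothetical helper: escape both values so characters such as "." or "+"
    # in them are matched literally rather than as regex syntax.
    e1, e2 = re.escape(env1), re.escape(env2)
    # Same shape as the pattern above: either order, anchored at both ends.
    return re.compile(rf"^(?:{e1},{e2}|{e2},{e1})$")

pat = build_pattern("SIT", "UAT")

scenarios = ["SIT,UAT", "UAT,SIT", "SIT", "UAT", "TRA,SIT,UAT"]
results = {s: bool(pat.match(s)) for s in scenarios}

for s, ok in results.items():
    print(f"{s}: {'match' if ok else 'no match'}")
```

Only the first two scenarios should be reported as matches.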
Sample Pandas code:
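# Assumes df is an existing DataFrame with a string column named "col"
def valid(env1, env2):
    # Anchored alternation: matches only "env1,env2" or "env2,env1"
    pat = r'^(?:' + env1 + r',' + env2 + r'|' + env2 + r',' + env1 + r')$'
    return df["col"].str.contains(pat, regex=True)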
If you need to cater to more than two expected CSV values, then regex might not scale nicely. In that case, I would suggest splitting the input on comma and then using the base string functions:
inp = "TST,SIT,UAT,PROD"
vals = inp.split(",")
allowed = ["SIT", "UAT"]
output = all(p in allowed for p in vals)
print(output) # False, because the input has TST and PROD
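Note that the `all(...)` check only verifies that every value in the input is allowed, so it would also accept SIT on its own. If the input must contain exactly the expected values, in any order, as in the scenarios from the question, a set comparison is one option. This is a sketch; the function name `exact_match` is made up for illustration.

```python
def exact_match(inp, expected):
    vals = inp.split(",")
    # Reject repeated values, then require the same set of elements,
    # regardless of order.
    return len(vals) == len(set(vals)) and set(vals) == set(expected)

expected = ["SIT", "UAT"]
for s in ["SIT,UAT", "UAT,SIT", "SIT", "UAT", "TRA,SIT,UAT"]:
    print(s, exact_match(s, expected))
```

This works for any number of expected values without the pattern growing with every possible ordering.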