I'm trying to capture values entered in a syntax like this one: name="Game Title" authors="John Doe" studios="Studio A,Studio B" licence=ABC123 url=https://example.com command="start game" type=action code=xyz78
But the name, author, studio, …, code statements could appear in an arbitrary order, different from the one above.
For the moment here is my code:
import re
input_string = 'name="Game Title" authors="John Doe" studios="Studio A,Studio B" licence=ABC123 url=https://example.com command="start game" type=action code=xyz789'
ADD_GAME_PATERN = r'(?P<name>(?:"[^"]*"|\'[^\']*\'|[^"\']*))\s+' \
r'licence=(?P<licence>[a-z0-9]*)\s+' \
r'type=(?P<typeCode>[a-z0-9]*)\s+' \
r'command=(?P<command>(?:"[^"]*"|\'[^\']*\'|[^"\']*))\s+' \
r'url=(?P<url>\S+)\s+' \
r'code=(?P<code>[a-z0-9]*)\s+' \
r'studios=(?P<studios>.*)\s+' \
r'authors=(?P<authors>.*)\s+'
match = re.match(ADD_GAME_PATERN, input_string)
if match:
    name = match.group('name')
    code = match.group('code')
    licence = match.group('licence')
    type_code = match.group('typeCode')
    command = match.group('command')
    url = match.group('url')
    studios = match.group('studios')
    authors = match.group('authors')
    print(f"Name: {name}")
    print(f"Code: {code}")
    print(f"Licence: {licence}")
    print(f"Type: {type_code}")
    print(f"Command: {command}")
    print(f"URL: {url}")
    print(f"Studios: {studios}")
    print(f"Authors: {authors}")
else:
    print("No correspondance founded.")
But in its current state, the pattern expects the statements in that exact order.
So how can I allow the statements to appear in a different, arbitrary order?
I'd use a simpler pattern and handle the rest in code:
([^=]+)=([^\s"]+)|([^=]+)="([^"]+)"
import re
s = 'name="Game Title" authors="John Doe" studios="Studio A,Studio B" licence=ABC123 url=https://example.com command="start game" type=action code=xyz789'
p = r'([^=]+)=([^\s"]+)|([^=]+)="([^"]+)"'
print(re.findall(p, s))
[('', '', 'name', 'Game Title'), ('', '', ' authors', 'John Doe'), ('', '', ' studios', 'Studio A,Studio B'), (' licence', 'ABC123', '', ''), (' url', 'https://example.com', '', ''), ('', '', ' command', 'start game'), (' type', 'action', '', ''), (' code', 'xyz789', '', '')]
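There are two types of values, unquoted and double-quoted, so the pattern defines four capture groups: a key group and a value group for each type:
([^=]+)=([^\s"]+) : capture groups 1 and 2, the key and value of an unquoted pair.
| : or
([^=]+)="([^"]+)" : capture groups 3 and 4, the key and value of a double-quoted pair.
Note that the keys are captured with their leading whitespace (e.g. ' authors'), so they need to be stripped afterwards.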
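The "code the rest" part could look something like the sketch below. It is only one possible way to do it: the helper name `parse_pairs` and the `fields` dictionary are made up for illustration. It merges the two alternatives of each match into a single key and value, strips the whitespace captured with the key, and stores the result in a dictionary, so the order of the statements no longer matters.

```python
import re

s = 'name="Game Title" authors="John Doe" studios="Studio A,Studio B" licence=ABC123 url=https://example.com command="start game" type=action code=xyz789'
p = r'([^=]+)=([^\s"]+)|([^=]+)="([^"]+)"'


def parse_pairs(text):
    """Collect key=value pairs into a dict (hypothetical helper, not part of re)."""
    fields = {}
    for key1, value1, key2, value2 in re.findall(p, text):
        # Only one alternative matches per pair; the other two groups are ''.
        key = (key1 or key2).strip()  # drop the whitespace captured before the key
        fields[key] = value1 or value2
    return fields


fields = parse_pairs(s)
print(fields["name"])
print(fields["studios"].split(","))
```

From there, each value can be looked up by key, e.g. `fields["licence"]`, and a missing statement can be detected with `fields.get("licence")` returning `None`.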