Tags: python, list, markup, multiple-choice

Python - Multiple choice markup parsing


Consider this text:

Would you like to have responses to your questions sent to you via email ?

I want to offer multiple choices for several of its words by marking them up like this:

Would you like [to get]|[having]|g[to have] responses to your questions sent [up to]|g[to]|[on] you via email ?

The choices are bracketed and separated by pipes.
The correct choice is preceded by a g.

I would like to parse this sentence to get the text formatted like this:

Would you like __ responses to your questions sent __ you via email ?

Along with a list like this:

[
  [
    {"to get":0},
    {"having":0},
    {"to have":1},
  ],
  [
    {"up to":0},
    {"to":1},
    {"on":0},
  ],
]

Is my markup design OK?
How can I use a regex on the sentence to get the formatted text and generate the list?

Edit: the markup language needs to be user-oriented.


Solution

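  • I'll suggest my own solution too, using a simpler markup in which the choices sit inside braces, are separated by pipes, and the correct one is prefixed with a +: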

    Would you like {to get|having|+to have} responses to your questions sent {up to|+to|on} you via email ?

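    import re

    def extract_choices(text):
        """Replace each {a|b|+c} group with a blank and collect its choices."""
        choices = []

        def callback(match):
            # Drop the surrounding braces, then split the variants on pipes.
            variants = match.group().strip('{}')
            # Map each variant to True if it carries the '+' marker, else False.
            choices.append({
                v.lstrip('+'): v.startswith('+')
                for v in variants.split('|')
            })
            return '___'

        # Non-greedy match so that each {...} group is handled separately.
        text = re.sub(r'\{.*?\}', callback, text)

        return text, choices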
    

    Let's try it:

    >>> import pprint
    >>> t = 'Would you like {to get|having|+to have} responses to your questions sent {up to|+to|on} you via email?'
    >>> pprint.pprint(extract_choices(t))
    ('Would you like ___ responses to your questions sent ___ you via email?',
     [{'having': False, 'to get': False, 'to have': True},
      {'on': False, 'to': True, 'up to': False}])
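If the bracket markup from the question has to stay as it is, the same callback idea can probably be adapted to it. The following is only a minimal sketch, not a tested parser: the names `parse_bracket_markup`, `CHOICE` and `GROUP` are made up for this illustration, and it assumes that choices never contain a `]` and that a group is never glued to a preceding word ending in `g`. It returns the blanked sentence plus a list of lists of single-key dicts with 0/1 values, as asked for in the question.

```python
import re

# Hypothetical pattern for one choice: optional "g" flag, then bracketed text.
CHOICE = r'g?\[[^\]]*\]'
# A group is assumed to be two or more choices joined by pipes.
GROUP = re.compile(CHOICE + r'(?:\|' + CHOICE + r')+')


def parse_bracket_markup(text):
    """Return (text with each group replaced by '__', list of choice groups)."""
    groups = []

    def callback(match):
        group = []
        # Pull out the flag and the label of every choice in this group.
        for flag, label in re.findall(r'(g?)\[([^\]]*)\]', match.group()):
            group.append({label: 1 if flag else 0})
        groups.append(group)
        return '__'

    return GROUP.sub(callback, text), groups


sentence = ('Would you like [to get]|[having]|g[to have] responses to your '
            'questions sent [up to]|g[to]|[on] you via email ?')
blanked, groups = parse_bracket_markup(sentence)
# blanked == 'Would you like __ responses to your questions sent __ you via email ?'
# groups  == [[{'to get': 0}, {'having': 0}, {'to have': 1}],
#             [{'up to': 0}, {'to': 1}, {'on': 0}]]
```

Keeping each choice as its own single-key dict preserves the order in which the choices were written, which the answer's one-dict-per-group layout only guarantees on Python versions where dicts keep insertion order.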