Splitting up a string using regex


I'm making a program to parse Morse code. The input is a string. Here is an example:

Input = '/.- .-.. .---- ..-. -.- - ..--- -.-. .... .....'

Desired Output = ['/', '.-', '.-..', '.----', '..-.', '-.-', '-', '..---', '-.-.', '....', '.....']

The / represents a space in the text. I would like to split the forward slash off into its own element rather than discard it. The spaces, however, I do want to discard. An expression like re.split(' |/', Input) removes the forward slash along with the spaces. How should I go about this? Thanks in advance.


Solution

  • A simple approach that avoids regex: itertools.groupby with a custom key function that distinguishes '/', ' ', and everything else, so that each run of dots and dashes forms one group:

    from itertools import groupby
    
    s = '/.- .-.. .---- ..-. -.- - ..--- -.-. .... .....'
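    # '/' and ' ' are their own keys; every other character ('.' or '-') shares the key 'x'.
    key = lambda c: c if c in ' /' else 'x'
    # Join each run of equal keys into one token and drop the runs of spaces.
    [''.join(g) for k, g in groupby(s, key=key) if k != ' ']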
    # ['/', '.-', '.-..', '.----', '..-.', '-.-', '-', '..---', '-.-.', '....', '.....']
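
  • Since the question asks about regex, here is a minimal sketch of two regex-based alternatives. It assumes the input contains only '.', '-', '/', and spaces. The names split_morse, tokens, and tokens_via_split are made up for illustration:

```python
import re

def split_morse(s):
    # Match either a single '/' or a run of dots and dashes.
    # Spaces match neither alternative, so they are simply skipped.
    return re.findall(r'/|[.-]+', s)

s = '/.- .-.. .---- ..-. -.- - ..--- -.-. .... .....'
tokens = split_morse(s)
print(tokens)
# ['/', '.-', '.-..', '.----', '..-.', '-.-', '-', '..---', '-.-.', '....', '.....']

# Same result with re.split: the capturing group keeps '/' in the output,
# and the filter drops the empty strings and None entries that split leaves behind.
tokens_via_split = [t for t in re.split(r'(/)| ', s) if t]
```

    One difference from the groupby version: two adjacent slashes come out as two separate '/' tokens here, whereas groupby would merge them into a single '//' token. Which behavior is preferable depends on how the input encodes consecutive spaces.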