Tags: python, string, window, python-3.6, skip

Slice a string while skipping a specific character


I have a string like this in Python 3:

ab_cdef_ghilm__nop_q__rs

Starting from a specific character, chosen by its index position, I want to slice a window of 5 characters on each side of it. If a _ character is encountered, it should be skipped and the next character taken instead. For example, for the character "i" in this string, I want a final string of 11 characters centered on the "i", skipping every _ that occurs, giving this output:

 defghilmnop

Consider that I have long strings and I want to choose the index position at which to do this; in this case index=10. Is there a command that crops a window of a given size from a string while skipping a specific character?

For the moment, what I'm able to do is remove the _ characters from the string while counting their occurrences, use that count to shift the middle index position, and finally crop a window of the desired size. But I want something more step-by-step, so if I could just jump every time a "_" is found, that would be perfect.
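
The strip-and-shift approach described above could be sketched roughly as follows. The function name `window_by_stripping` and its parameters are hypothetical, and the sketch assumes a single skip character and an index with 0 <= idx < len(s):

```python
def window_by_stripping(s, idx, n=5, skip="_"):
    """Strip the skip character, shift the index, then crop the window."""
    # Every skip character before idx moves the center one place to the left
    # once the string is stripped. Assumes skip is a single character.
    shift = s[:idx].count(skip)
    stripped = s.replace(skip, "")
    center = idx - shift
    # max() keeps a negative start from wrapping around to the end of the string.
    return stripped[max(0, center - n):center + n + 1]


s = "ab_cdef_ghilm__nop_q__rs"
print(window_by_stripping(s, 10))  # defghilmnop
print(window_by_stripping(s, 13))  # ghilmnopqrs
```

Because only the skip characters strictly before the index are counted, an index that falls on a _ ends up centered on the next kept character to its right.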

Situation B) index=13: I want 5 characters on the left and 5 on the right of this index, discarding (and not counting) the _ characters, which gives this output:

ghilmnopqrs

So basically, when the index corresponds to a character, start from it; when the index corresponds to a _ character, shift to the right up to the next character, so that the result is still a string of 11 characters. To make a long story short, the output is 11 characters with the index position in the middle. If the index position is a _, skip it and use the nearby (closest) character as the middle one.


Solution

  • I don't think there's a specific command for this, but you could build your own.

    For example:

    s = 'ab_cdef_ghilm__nop_q__rs'

    def get_slice(s, idx, n=5, ignored_chars='_'):
        if s[idx] in ignored_chars:
            # idx points at an ignored char: move it to the first valid char on the right
            idx = next((i for i, ch in enumerate(s[idx:], idx) if ch not in ignored_chars), None)
            if idx is None:
                # only ignored chars remain from idx onward, so there is no center
                return ''

        # original positions of all kept characters, in order
        keys = [i for i, ch in enumerate(s) if ch not in ignored_chars]
        # position of the center character within the filtered sequence
        pos = keys.index(idx)
        # slicing clamps the upper bound; max() keeps a negative start from wrapping around
        return ''.join(s[k] for k in keys[max(0, pos - n):pos + n + 1])

    print(get_slice(s, 10, 5, '_'))
    print(get_slice(s, 13, 5, '_'))
    

    Prints:

    defghilmnop
    ghilmnopqrs
    

    With print(get_slice(s, 1, 5, '_')), where only one character lies to the left of the index, the window is truncated on that side:

    abcdefg
    

    EDIT: Added a check for the case where the starting index points at an ignored character.
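
Since the question mentions long strings and jumping over each "_" as it is found, here is an alternative sketch. It walks outward from the index and stops once n characters have been collected on each side, so it only looks at the part of the string near the window. It is not the answer's method; the name `window_skipping` is hypothetical, and it assumes 0 <= idx < len(s):

```python
def window_skipping(s, idx, n=5, skip="_"):
    """Return up to n kept characters on each side of the center character."""
    # If idx points at a skipped character, move right to the next kept one.
    while idx < len(s) and s[idx] in skip:
        idx += 1
    if idx >= len(s):
        return ""  # only skipped characters from idx onward

    # Walk left from the center, jumping over skipped characters.
    left = []
    i = idx - 1
    while i >= 0 and len(left) < n:
        if s[i] not in skip:
            left.append(s[i])
        i -= 1

    # Walk right from the center in the same way.
    right = []
    j = idx + 1
    while j < len(s) and len(right) < n:
        if s[j] not in skip:
            right.append(s[j])
        j += 1

    # left was collected back to front, so reverse it before joining.
    return "".join(reversed(left)) + s[idx] + "".join(right)


s = "ab_cdef_ghilm__nop_q__rs"
print(window_skipping(s, 10))  # defghilmnop
print(window_skipping(s, 13))  # ghilmnopqrs
print(window_skipping(s, 1))   # abcdefg
```

As with get_slice, the window is simply shorter on a side where fewer than n characters are available.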