Tags: python, string, mapping, indices

Mapping modified string indices to original string indices in Python


I'm relatively new to programming and wanted to get some help on a problem I've had. I need to figure out a way to map the indices of a string back to an original string after removing certain positions. For example, say I had a string:

original_string = 'abcdefgh'

And I removed a few elements to get:

new_string = 'acfh'

I need a way to get the "true" indices of new_string. In other words, I want the indices of the positions I've kept as they were in original_string. Thus returning:

original_indices_of_new_string = [0,2,5,7]

My general approach has been something like this:

I find the positions I've removed in the original_string to get:

removed_positions = [1,3,4,6]

Then given the indices of new_string:

new_string_indices = [0,1,2,3]

Then I think I should be able to do something like this:

original_indices_of_new_string = []   
for i in new_string_indices:
        offset = 0
        corrected_value = i + offset
        if corrected_value in removed_positions:
            #somehow offset to correct value
            offset+=1
        else:
            original_indices_of_new_string.append(corrected_value)

This doesn't really work because the offset is reset to 0 on every iteration of the loop, when it should only change if corrected_value is in removed_positions (i.e. I want an offset of 2 for removed positions 3 and 4, but only 1 if consecutive positions weren't removed).

I need to do this based off positions I've removed rather than those I've kept because further down the line I'll be removing more positions and I'd like to just have an easy function to map those back to the original each time. I also can't just search for the parts I've removed because the real string isn't unique enough to guarantee that the correct portion gets found.

Any help would be much appreciated. I've been using Stack Overflow for a while now and have always found my question answered in a previous thread, but I couldn't find anything this time, so I decided to post a question myself! Let me know if anything needs clarification.

*Letters in the string are not unique


Solution

  • Given your string original_string = 'abcdefgh' you can create a list of (index, character) tuples:

    >>> li=[(i, c) for i, c in enumerate(original_string)]
    >>> li
    [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f'), (6, 'g'), (7, 'h')]
    

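    Then remove your desired characters (note that this filter matches on the character t[1]; if letters repeat, filter on the index t[0] instead):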

    >>> new_li=[t for t in li if t[1] not in 'bdeg']
    >>> new_li
    [(0, 'a'), (2, 'c'), (5, 'f'), (7, 'h')]
    

    Then rejoin that into a string:

    >>> ''.join([t[1] for t in new_li])
    'acfh'
    

    Your 'answer' comes from the way new_li was built: the first element of each remaining tuple is that character's index in original_string:

    >>> ', '.join(map(str, (t[0] for t in new_li)))
    '0, 2, 5, 7'
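
Since the question works from removed positions rather than removed characters, and the letters are not unique, the same idea can be sketched by filtering on indices instead. This is only a minimal sketch: the helper names kept_indices and remove_and_remap are hypothetical (they are not from the question or any library), and it assumes each later removal is given as positions in the current, already shortened string.

```python
def kept_indices(original_length, removed_positions):
    """Return the original indices that survive the removal."""
    removed = set(removed_positions)  # set gives fast membership tests
    return [i for i in range(original_length) if i not in removed]


def remove_and_remap(index_map, removed_positions):
    """Apply a further removal to an existing map.

    index_map[k] is the original index of character k in the current
    string; removed_positions are indices into that current string.
    """
    removed = set(removed_positions)
    return [orig for pos, orig in enumerate(index_map) if pos not in removed]


original_string = 'abcdefgh'

# First removal, expressed as positions in original_string.
index_map = kept_indices(len(original_string), [1, 3, 4, 6])
print(index_map)                                        # [0, 2, 5, 7]
print(''.join(original_string[i] for i in index_map))   # acfh

# A later removal, expressed in new_string's own indices: drop 'c' (position 1).
index_map = remove_and_remap(index_map, [1])
print(index_map)                                        # [0, 5, 7]
```

Keeping the map as a plain list means the current string can always be rebuilt from original_string, and each further removal only needs positions relative to the string as it stands at that point.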