Search code examples
arraysstringlistalgorithmformat

How to modify a string while preserving its original index as reference?


I'm looking for a computationally efficient way to add and remove elements from a string at will using its original index as reference. A good example I have in mind is to be able to format a text using html tags:

Given the following string Lorem Ipsum, I would like to be able to warp the word Ipsum with a b tag, such as Lorem <b>Ipsum</b>, and then delete these tags and add a new line between the two words, such as Lorem <br> Ispum, etc.

Of course, every time I'd add or remove an element, the string index is modified and I loose the ability to use the original index to modify the string. The method I came up with so far is to assign every character its original index, such as (0, 'L'), (1, 'o'), (2, 'r'), (3, 'e'), ... and then tinker in-between this original index and the one of the modified string.

This is highly inefficient, and formatting a big string, such as a book chapter, is too slow to support on-the-fly interaction with the text.

Is there a way to implement this properly ? I'm sure this is a common algorithmic problem but I guess I lack the vocabulary to look it up since I couldn't find anything so far.


Solution

  • Generally:

    1. Store the text in a data structure that divides it into blocks so that it can be more efficiently modified, like a rope.
    2. In each block, keep back-pointers to all of the references that point to text positions within the block, and adjust them whenever the content of a block changes.

    This sort of problem arises in the implementation of most text editors.