
How to add tags around spans of substrings in a string?


Given a string and a list of (start, end) spans into it, where end is exclusive:

s = "there is wall E on the way"
spans = [(0,5), (9,13), (16,18)]

The goal is to wrap each span in an XML-like tag to produce:

<span>there</span> is <span>wall</span> E <span>on</span> the way

I've tried looping over the spans: append the untagged text from the end of the previous span up to the start of the current one, then append the current span wrapped in the tag, and repeat, i.e.

s = "there is wall E on the way"
spans = [(0,5), (9,13), (16,18)]


x = []  # pieces of the result, joined at the end
start, end = 0, 0
for sp in spans:
    start = sp[0]
    x.append(s[end:start])
    end = sp[1]
    x.append(f'<span>{s[start:end]}</span>')
x.append(s[end:])

output = "".join(x)

There must be a simpler way to get the same output without all those appends. How else can I produce the same span-tagged output?


Another example, input:

s = "there is wall E on the way"
spans = [(0,5), (16,18)]

Expected output:

<span>there</span> is wall E <span>on</span> the way

Yet another example, input:

s = "there is wall E on the way"
spans = [(0,5), (16,22)]

Expected output:

<span>there</span> is wall E <span>on the</span> way

Spans are not guaranteed to align with word boundaries, so we should also expect input like:

s = "there is wall E on the way"
spans = [(0,11), (16,22)]

Expected output:

<span>there is wa</span>ll E <span>on the</span> way

Solution

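  • Try it: turn the string into a list of characters and attach the tags to the characters at the span boundaries, so that adding a tag never shifts the other indices: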

    s = "there is wall E on the way"
    spans = [(0,5), (9,13), (16,18)]
    
    
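    def solution(sentence, location):
        # One list slot per character, so adding a tag never shifts later indices.
        res = list(sentence)
        for start, end in location:
            res[start] = "<span>" + res[start]
            # end is exclusive, so close after the span's last character;
            # res[end] would raise IndexError for a span ending at len(sentence).
            res[end - 1] += "</span>"
        return "".join(res)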
    
    
    for s, l in [
        ["there is wall E on the way", [(0,5), (9,13), (16,18)]],
        ["there is wall E on the way", [(0,5), (16,18)]],
        ["there is wall E on the way", [(0,5), (16,22)]],
        ["there is wall E on the way", [(0,11), (16,22)]],
    ]:
        print(solution(s, l))
    

    OUTPUT:

    <span>there</span> is <span>wall</span> E <span>on</span> the way
    <span>there</span> is wall E <span>on</span> the way
    <span>there</span> is wall E <span>on the</span> way
    <span>there is wa</span>ll E <span>on the</span> way
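
If you would rather not build a per-character list, here is a minimal sketch of an alternative that splices the tags in with slicing. It works from the last span back to the first, so inserting a tag never shifts the offsets of spans that are still to be processed. The name `tag_spans` and the `tag` parameter are made up for illustration, and the sketch assumes the spans are end-exclusive and do not overlap.

```python
def tag_spans(text, spans, tag="span"):
    """Wrap each (start, end) slice of text in <tag>...</tag>.

    Assumes end-exclusive, non-overlapping spans (an assumption of this sketch).
    """
    # Process spans from right to left: text to the left of the current
    # span is untouched, so the remaining offsets stay valid.
    for start, end in sorted(spans, reverse=True):
        text = f"{text[:start]}<{tag}>{text[start:end]}</{tag}>{text[end:]}"
    return text


s = "there is wall E on the way"
print(tag_spans(s, [(0, 5), (9, 13), (16, 18)]))
print(tag_spans(s, [(0, 5), (23, 26)]))  # a span that reaches the end of the string
```

Because it only slices, this version also copes with a span that ends at `len(s)` and with spans that touch each other, such as `(0, 5)` and `(5, 8)`. The trade-off is that every iteration rebuilds the whole string, which is fine for short inputs but slower than a single join when there are many spans.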