Combining strings which have been altered

I have the following three strings:

"A randomized, prospective study of [intervention]endometrial resection[intervention] to prevent recurrent endometrial polyps in women with breast cancer receiving tamoxifen. To assess the role of endometrial resection in preventing recurrence of tamoxifen-associated endometrial polyps in women with breast cancer."
"A randomized, prospective study of endometrial resection to prevent [condition]recurrent endometrial polyps[condition] in women with breast cancer receiving tamoxifen. To assess the role of endometrial resection in preventing recurrence of tamoxifen-associated endometrial polyps in women with breast cancer."
"A randomized, prospective study of endometrial resection to prevent recurrent endometrial polyps in [eligibility]women with breast cancer receiving tamoxifen[eligibility]. To assess the role of endometrial resection in preventing recurrence of tamoxifen-associated endometrial polyps in women with breast cancer."

Is there a way to efficiently combine the three strings into one, where you can see all the annotations (between brackets) that I have made? I cannot come up with anything efficient by myself. The result should look like:

"A randomized, prospective study of [intervention]endometrial resection[intervention] to prevent [condition]recurrent endometrial polyps[condition] in [eligibility]women with breast cancer receiving tamoxifen[eligibility]. To assess the role of endometrial resection in preventing recurrence of tamoxifen-associated endometrial polyps in women with breast cancer."

Thanks in advance!

Solution

This assumes the tags are always attached directly to existing words, so that splitting each string on whitespace gives the same number of tokens in the same positions (which is the case in the example). Under that assumption, a simple solution is to zip the split strings, keep the longest variant at each position using max, and join the result back into a single string:

strings = ["A randomized, prospective study of [intervention]endometrial resection[intervention] to prevent recurrent endometrial polyps in women with breast cancer receiving tamoxifen. To assess the role of endometrial resection in preventing recurrence of tamoxifen-associated endometrial polyps in women with breast cancer.",
           "A randomized, prospective study of endometrial resection to prevent [condition]recurrent endometrial polyps[condition] in women with breast cancer receiving tamoxifen. To assess the role of endometrial resection in preventing recurrence of tamoxifen-associated endometrial polyps in women with breast cancer.",
           "A randomized, prospective study of endometrial resection to prevent recurrent endometrial polyps in [eligibility]women with breast cancer receiving tamoxifen[eligibility]. To assess the role of endometrial resection in preventing recurrence of tamoxifen-associated endometrial polyps in women with breast cancer.",
          ]

# At each token position, the annotated variant is the longest one.
out = ' '.join(max(x, key=len) for x in zip(*map(str.split, strings)))

Output:

'A randomized, prospective study of [intervention]endometrial resection[intervention] to prevent [condition]recurrent endometrial polyps[condition] in [eligibility]women with breast cancer receiving tamoxifen[eligibility]. To assess the role of endometrial resection in preventing recurrence of tamoxifen-associated endometrial polyps in women with breast cancer.'
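
One caveat with the one-liner: zip stops at the shortest input, so if the strings ever split into different numbers of tokens (for instance because a tag contains a space), the result is silently truncated or misaligned. Also, if two strings annotate the same token, max keeps only one of the variants. A minimal sketch of the same idea with an explicit length check might look like this (merge_by_tokens is a hypothetical name, not part of any library):

```python
def merge_by_tokens(strings):
    """Hypothetical helper: per-token merge that fails loudly on misalignment."""
    token_lists = [s.split() for s in strings]
    lengths = {len(tokens) for tokens in token_lists}
    if len(lengths) > 1:
        raise ValueError(
            f"strings split into different token counts: {sorted(lengths)}"
        )
    # At each position, the annotated variant is the longest one.
    return " ".join(max(variants, key=len) for variants in zip(*token_lists))
```

Calling merge_by_tokens(strings) on the list above should give the same result as the one-liner, since all three strings split into the same number of tokens.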

If you need something more robust, a good starting point is the difflib module from the standard library: compute the differences between successive strings and keep the longest variant in each comparison.
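
As a rough sketch of the difflib idea, assuming the strings are identical apart from the inserted [tag] markers: treat each bracketed tag as a single token and every other character as its own token, then fold the strings together pairwise with difflib.SequenceMatcher, keeping the accumulated tokens and adding any tags that only the next string has. The names TOKEN, merge_pair and merge_annotations are made up for this illustration, and text with many repeated phrases could in principle be aligned differently than you expect, so check the output on your own data.

```python
import re
from difflib import SequenceMatcher
from functools import reduce

# A token is either a whole bracketed tag or any single character.
TOKEN = re.compile(r"\[[^\]]+\]|.", re.DOTALL)


def merge_pair(a, b):
    """Merge two token lists, keeping a's tokens plus the tags only b has."""
    merged = []
    # autojunk=False: otherwise frequent characters are ignored in long inputs.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "insert":
            # 'equal', 'delete' and 'replace' all keep a's tokens
            merged.extend(a[i1:i2])
        if op in ("insert", "replace"):
            # bring over only the bracketed tags that b adds at this point
            merged.extend(t for t in b[j1:j2] if len(t) > 1 and t.startswith("["))
    return merged


def merge_annotations(strings):
    """Fold all strings into one that carries every tag (hypothetical helper)."""
    token_lists = [TOKEN.findall(s) for s in strings]
    return "".join(reduce(merge_pair, token_lists))


parts = [
    "study of [intervention]resection[intervention] to prevent polyps in women.",
    "study of resection to prevent [condition]polyps[condition] in women.",
    "study of resection to prevent polyps in [eligibility]women[eligibility].",
]
print(merge_annotations(parts))
```

Unlike the split-based version, this does not depend on whitespace alignment, so tags placed mid-word or directly before punctuation are handled as well. The trade-off is that it runs a diff per string, which is slower than a single zip.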