Input String:
However, the gene of hBD-1 and LL-27 expression was not affected by cancer in both acne and non-acne patients.
Expected Output String:
However, the gene of hBD-1 expression and LL-27 expression was not affected by cancer in both acne patients and non-acne patients.
Code:
import re
string_a = "However, the gene of hBD-1 and LL-27 expression was not affected by cancer in both acne and non-acne patients."
print(string_a)
print('\n')
output = re.sub(r'\b(\w+-(\d+|[A-Za-z]+))\b(?! [A-Za-z]+\b)', r'\b(\1 [A-Za-z]+)\b', string_a)
print(output)
My code does not produce the expected output string. Please look at it and suggest a fix.
I would use re.sub here to selectively replace any gene term with itself followed by the text " expression", for those gene terms that are not already followed by it.
inp = "However, the gene of hBD-1 and LL-27 expression was not affected by acnes."
output = re.sub(r'\b(\w+-\d+)\b(?! expression\b)', r'\1 expression', inp)
print(output)
This prints:
However, the gene of hBD-1 expression and LL-27 expression was not affected by acnes.
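The expected output in the question also expands "acne and non-acne patients" into "acne patients and non-acne patients", which the substitution above does not cover. A minimal sketch of one way to handle it is a second re.sub pass that uses a backreference to find "X and non-X noun". The second pattern is my own assumption about how such phrases look, and the variable names are only illustrative. It is written against this one sentence and is not a general solution:

```python
import re

text = ("However, the gene of hBD-1 and LL-27 expression was not affected "
        "by cancer in both acne and non-acne patients.")

# Pass 1: append " expression" to every gene-like token (word, hyphen, digits)
# that is not already followed by " expression".
step1 = re.sub(r'\b(\w+-\d+)\b(?! expression\b)', r'\1 expression', text)

# Pass 2 (assumed pattern): rewrite "X and non-X noun" as "X noun and non-X noun".
# \1 inside the pattern requires the word after "non-" to equal the first word.
step2 = re.sub(r'\b(\w+) and (non-\1) (\w+)', r'\1 \3 and \2 \3', step1)

print(step2)
```

Note that the replacement argument of re.sub is a template rather than a pattern: group references such as \1 are expanded, but a character class like [A-Za-z]+ is inserted as literal text. That is why the replacement in the question's attempt cannot work as written.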