I'm trying to write a simple script that iterates through an input string and converts all digits in the string to subscript digits.
Here is my latest attempt, which iterates over an input string item and should build a new string containing the subscript digits in place of the digits in the original string. Maybe it's not possible, but I can't seem to combine Unicode escapes and string formatting to make this work.
item= 'H2O'
new=[]
sub = u'\u208'
for i,x in enumerate(item):
    if x.isdigit():
        sub=u'{x}'.format(sub)
        new.append(sub)
    else:
        new.append(x)
new=''.join(new)
new
I get the following error:
File "<ipython-input-48-1d7d4a7394db>", line 4
sub = u'\u208'
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-4: truncated \uXXXX escape
In the end, I'd like to do the following "conversion" to get a "number-subscripted" version (H₂O) of an input string (H2O):
H2O --> H₂O
Any thoughts on what I'm doing wrong or if there might be a better way to do this? Thanks!
You can use str.maketrans(). The code points u'\u2080' to u'\u2089' are the subscript digits 0 to 9.
sub=str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
_str='C3H8O3'
_str=_str.translate(sub)
print(_str)
Output:
C₃H₈O₃
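In your code, sub=u'\u208' should be sub=u'\u2082'. A \u escape needs exactly four hex digits, which is why Python reports a truncated \uXXXX escape. A simple replace would have been enough.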
_str='H2O'
sub=u'\u2082'  # SUBSCRIPT TWO; note that every digit found is replaced by this one character
for char in _str:
    if char.isdigit():
        _str=_str.replace(char,sub)
print(_str)
H₂O
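The loop from the question can also be made to work for any digit by computing each subscript character from its code point, rather than trying to format a partial escape. This is only a minimal sketch, assuming the subscript digits occupy U+2080 to U+2089 as noted above; the function name to_subscript is illustrative, not something from the question.

```python
SUBSCRIPT_ZERO = 0x2080  # code point of SUBSCRIPT ZERO


def to_subscript(text):
    """Return text with the ASCII digits 0-9 replaced by subscript digits."""
    new = []
    for x in text:
        # Compare against the ASCII digits explicitly: str.isdigit() is also
        # True for characters such as '²', which int() cannot convert.
        if x in '0123456789':
            new.append(chr(SUBSCRIPT_ZERO + int(x)))
        else:
            new.append(x)
    return ''.join(new)


print(to_subscript('H2O'))
print(to_subscript('C3H8O3'))
```

Offsetting from 0x2080 avoids writing out ten separate escapes, and the explicit digit check keeps characters that are already superscripts or subscripts untouched.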
You can also build a dictionary that maps normal digits to subscript digits.
sub=u'\u2080'
norm_to_sub={}
for norm in '0123456789':
    norm_to_sub[norm]=sub
    sub=chr(ord(sub)+1)
print(norm_to_sub)
{'0': '₀', '1': '₁', '2': '₂', '3': '₃', '4': '₄', '5': '₅', '6': '₆', '7': '₇', '8': '₈', '9': '₉'}
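A dictionary like this can be handed directly to str.maketrans(), which accepts a single dict whose keys are one-character strings. The following is a small sketch that rebuilds the mapping with a comprehension so it runs on its own; the variable names table, formula and converted are only illustrative.

```python
# Map '0'..'9' to the subscript digits U+2080..U+2089.
norm_to_sub = {str(d): chr(0x2080 + d) for d in range(10)}

# With one argument, str.maketrans() takes a dict with single-character keys.
table = str.maketrans(norm_to_sub)

formula = 'C3H8O3'
converted = formula.translate(table)
print(converted)
```

Characters that are not keys in the dictionary pass through translate() unchanged, so letters in the formula are left as they are.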
As suggested by wjandrea, you can also work with the integer code point directly.
sub = 0x2080
norm_to_sub={}
for norm in range(10):
    norm_to_sub[str(norm)] = chr(sub + norm)  # chr() turns the code point into a character
print(norm_to_sub)
{'0': '₀', '1': '₁', '2': '₂', '3': '₃', '4': '₄', '5': '₅', '6': '₆', '7': '₇', '8': '₈', '9': '₉'}
You can even create a function.
def change_to_sub(number):
    sub=0x2080
    return ''.join(chr(sub+int(num)) for num in str(number))
print(change_to_sub(1232454353654))
₁₂₃₂₄₅₄₃₅₃₆₅₄
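Note that change_to_sub() expects a non-negative integer: int(num) raises ValueError for a minus sign or a letter. If signed values matter, one possible variant is sketched here. It is not part of the original answer, the name signed_to_sub is hypothetical, and it assumes you want '+' and '-' mapped to SUBSCRIPT PLUS SIGN (U+208A) and SUBSCRIPT MINUS (U+208B).

```python
# Translation table: ASCII digits and signs -> their subscript forms.
SIGNED_SUB = str.maketrans(
    '0123456789+-',
    '\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089\u208a\u208b',
)


def signed_to_sub(number):
    """Return the subscript form of an integer, including its sign."""
    return str(number).translate(SIGNED_SUB)


print(signed_to_sub(-12))
print(signed_to_sub(2024))
```

Because translate() leaves unmapped characters alone, this version does not raise on unexpected input; it simply passes such characters through.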