Tags: python, python-3.x, string, subscript

Adding subscript formatting to all numbers in a string


I'm trying to write a simple script that iterates through an input string and converts all numbers in the string to subscripted numbers.

Here is my latest attempt that iterates over an input string item and should create a new string containing the subscripted numbers in place of the numbers in the original string. Maybe it's not possible, but I can't seem to combine Unicode and format string literal expressions to make this work.

item= 'H2O'
new=[]

sub = u'\u208'

for i,x in enumerate(item):
    if x.isdigit():
        sub=u'{x}'.format(sub)
        new.append(sub)
    else:
        new.append(x)
new=''.join(new)

new

I get the following error:

File "<ipython-input-48-1d7d4a7394db>", line 4
    sub = u'\u208'
         ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-4: truncated \uXXXX escape

In the end, I'd like to do the following "conversion" to get a "number-subscripted" version (H₂O) of an input string (H2O):

H2O --> H₂O

Any thoughts on what I'm doing wrong or if there might be a better way to do this? Thanks!


Solution

  • You can use str.maketrans().

    The code points u'\u2080' to u'\u2089' are the Unicode subscript digits 0 to 9.

    sub=str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")
    _str='C3H8O3'
    _str=_str.translate(sub)
    print(_str)
    

    output

    'C₃H₈O₃'
    

    In your code, sub=u'\u208' should be sub=u'\u2082': a \u escape needs exactly four hexadecimal digits. For H2O, where the only digit is 2, a simple replace would have been enough.

    _str='H2O'
    sub=u'\u2082'
    for char in _str:
        if char.isdigit():
            _str=_str.replace(char,sub)
    print(_str)
    

    'H₂O'
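For strings with more than one distinct digit, the loop from the question can be adapted instead. The following is a minimal sketch, not the original poster's code: it assumes only ASCII digits should be converted, and it computes each subscript by adding the digit's value to the code point U+2080. The name `SUBSCRIPT_ZERO` is introduced here purely for illustration.

```python
SUBSCRIPT_ZERO = 0x2080  # code point of '₀'; '₁' to '₉' follow consecutively

item = 'C3H8O3'
new = []
for x in item:
    # Compare against ASCII digits explicitly: str.isdigit() is also True
    # for characters such as '₂', which int() cannot parse.
    if x in '0123456789':
        new.append(chr(SUBSCRIPT_ZERO + int(x)))
    else:
        new.append(x)
new = ''.join(new)
print(new)
```

Unlike the replace-based loop, this handles each digit independently, so it does not depend on which digits appear in the input.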
    

    You can also build a dictionary that maps normal digits to subscript digits.

    sub=u'\u2080'
    norm_to_sub={}
    for norm in '0123456789':
        norm_to_sub[norm]=sub
        sub=chr(ord(sub)+1)
    
    print(norm_to_sub)
    

    {'0': '₀', '1': '₁', '2': '₂', '3': '₃', '4': '₄', '5': '₅', '6': '₆', '7': '₇', '8': '₈', '9': '₉'}
    

    As suggested by wjandrea, you can also start from the integer code point.

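    sub = 0x2080  # code point of subscript zero
    norm_to_sub={}
    for norm in range(10):
        # chr() turns a code point into a character; str() keeps the keys as strings
        norm_to_sub[str(norm)] = chr(sub + norm)

    print(norm_to_sub)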
    

    {'0': '₀', '1': '₁', '2': '₂', '3': '₃', '4': '₄', '5': '₅', '6': '₆', '7': '₇', '8': '₈', '9': '₉'}
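A dictionary like this can be handed to `str.maketrans()`, which accepts a single dict whose keys are one-character strings, and the resulting table can then be used with `str.translate()`. This is a short sketch, and `to_subscript` is a hypothetical helper name that does not appear in the answer above.

```python
# Build the digit -> subscript mapping with a dict comprehension.
norm_to_sub = {str(d): chr(0x2080 + d) for d in range(10)}

# str.maketrans() accepts a single dict with one-character string keys.
table = str.maketrans(norm_to_sub)


def to_subscript(text):
    """Return text with ASCII digits replaced by subscript digits."""
    return text.translate(table)


print(to_subscript('H2O'))
print(to_subscript('C6H12O6'))
```

Characters that are not keys in the dictionary, such as letters, pass through unchanged.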
    

    You can even create a function.

    def change_to_sub(number):
        sub=0x2080
        return ''.join(chr(sub+int(num)) for num in str(number))
    
    print(change_to_sub(1232454353654))
    

    '₁₂₃₂₄₅₄₃₅₃₆₅₄'