python-3.x, list, function, replace, sublist

Create a function to remove or replace the trailing numbers in a list containing 2 values per sublist


I have a long list containing many sublists, each of which consists of 2 "values", for instance:

test=[["AAAGG1","AAAAA22"],["GGGGA1","AAGGA"],["GGGGG23","GGAGA6"]]

What I want is to replace or remove the trailing digits. To do that, I tried using a pretty long function:

def remove_numbers(index,newlist):
    for com in index:
        for dup in com:
            if "1" in dup:
                newlist.append(dup.replace("1",""))
            elif "2" in dup:
                newlist.append(dup.replace("2",""))
            elif "3" in dup:
                newlist.append(dup.replace("3",""))
            elif "4" in dup:
                newlist.append(dup.replace("4",""))
            elif "5" in dup:
                newlist.append(dup.replace("5",""))
            elif "6" in dup:
                newlist.append(dup.replace("6",""))
            elif "7" in dup:
                newlist.append(dup.replace("7",""))
            elif "8" in dup:
                newlist.append(dup.replace("8",""))
            elif "9" in dup:
                newlist.append(dup.replace("9",""))
            else:
                newlist.append(dup)

I created an empty list and called the function:

emptytest=[]
testfunction=remove_numbers(test,emptytest)

When I print emptytest, the output is the following:

['AAAGG', 'AAAAA', 'GGGGA', 'AAGGA', 'GGGGG3', 'GGAGA']

The problem is that the result is now a single flat list, and when a string ends in two different digits (such as "GGGGG23"), only one of them is removed. I need the sublists to remain intact.

Does anybody know of a solution for this?

Sorry if this is a simple question; I am not that experienced with Python yet, and I couldn't find a suitable solution on the web or in an existing forum.


Solution

  • Use a regex to strip the trailing digits instead of checking for each digit manually. The whole thing can be done in the two lines below (after the import).

    import re
    processed = [[re.sub(r"\d+$","",n) for n in t] for t in test]
    print(processed)
    

    This gives the result:

    [['AAAGG', 'AAAAA'], ['GGGGA', 'AAGGA'], ['GGGGG', 'GGAGA']]
    

    Here the regex "\d+$" matches one or more digits at the end of the string, and re.sub replaces that match with an empty string. Strings without trailing digits are returned unchanged, and the nested list comprehension keeps the sublist structure intact.
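Since the question asks for a function that can either remove or replace the trailing digits, the same idea could be wrapped up roughly as follows. This is only a sketch: the function names and the `replacement` parameter are hypothetical and not part of the answer above. The second function shows a possible regex-free variant using `str.rstrip`, which only strips the ASCII digits listed in its argument.

```python
import re

# Hypothetical helper (name and `replacement` parameter are assumptions).
def strip_trailing_digits(pairs, replacement=""):
    """Return a new nested list with each string's trailing digits replaced."""
    # The outer comprehension walks the sublists, the inner one the strings,
    # so the two-values-per-sublist structure is preserved.
    return [[re.sub(r"\d+$", replacement, s) for s in pair] for pair in pairs]

# Hypothetical regex-free alternative for plain removal.
def strip_trailing_digits_no_regex(pairs):
    """Remove trailing ASCII digits using str.rstrip."""
    return [[s.rstrip("0123456789") for s in pair] for pair in pairs]

test = [["AAAGG1", "AAAAA22"], ["GGGGA1", "AAGGA"], ["GGGGG23", "GGAGA6"]]

print(strip_trailing_digits(test))           # trailing digits removed
print(strip_trailing_digits(test, "_X"))     # trailing digits replaced by "_X"
print(strip_trailing_digits_no_regex(test))  # same removal without a regex
```

Both functions build a new list, so the original `test` list is left unchanged, unlike the approach of appending to a list passed in as an argument.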