Remove punctuation items from end of string


I have a seemingly simple problem that I cannot seem to solve. Given a string containing a DOI, I need to keep removing the last character while it is a punctuation mark, stopping once the last character is a letter or number.

For example, if the string was:

sampleDoi = "10.1097/JHM-D-18-00044.',"

I want the following output:

"10.1097/JHM-D-18-00044"

i.e. remove the trailing .',

I wrote the following script to do this:

import string

invalidChars = set(string.punctuation.replace("_", ""))
a = "10.1097/JHM-D-18-00044.',"
i = -1
for each in reversed(a):
    if any(char in invalidChars for char in each):
        a = a[:i]
        i = i - 1
    else:
        print (a)
        break

However, this produces 10.1097/JHM-D-18-00 but I would like it to produce 10.1097/JHM-D-18-00044. Why is the 44 removed from the end?


Solution

  • Corrected code:

    import string

    # Every punctuation character except the underscore counts as invalid.
    invalidChars = set(string.punctuation.replace("_", ""))
    a = "10.1097/JHM-D-18-00044.',"
    i = -1  # fixed slice index: always drop exactly one trailing character
    for each in reversed(a):
        if each in invalidChars:
            a = a[:i]
        else:
            print(a)
            break
    

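    This gives the output you want while keeping the original code mostly the same. The original went wrong because i was decremented after every removal even though a had already been shortened: the slices a[:-1], a[:-2], and a[:-3] dropped one, two, and three characters, six in total, which took the trailing 044 along with the three punctuation marks. Leaving i at -1 drops exactly one character per pass.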
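
If keeping the loop is not a requirement, the same result can probably be had without one. The sketch below relies on str.rstrip, which removes any trailing characters found in the string it is given. The function name strip_trailing_punctuation and the constant TRAILING_PUNCTUATION are made up for illustration and do not come from the question.

```python
import string

# Hypothetical constant: the same character set as in the question,
# i.e. all ASCII punctuation except the underscore.
TRAILING_PUNCTUATION = string.punctuation.replace("_", "")


def strip_trailing_punctuation(doi: str) -> str:
    """Return doi with any run of trailing punctuation removed."""
    # rstrip treats its argument as a set of characters, not a suffix,
    # so it keeps stripping until it meets a character outside the set.
    return doi.rstrip(TRAILING_PUNCTUATION)


print(strip_trailing_punctuation("10.1097/JHM-D-18-00044.',"))
# prints 10.1097/JHM-D-18-00044
```

Punctuation inside the DOI (the dots, slash and hyphens) is left alone, because rstrip only looks at the end of the string. Unlike the loop, this version returns an empty string when the input consists entirely of punctuation.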