python, regex, lowercase

.lower() and a regular expression in the same line?


I have a regular expression that removes all non-alphabetic characters:

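import re

def genLetters(string):
  regex = re.compile('[^a-zA-Z]')  # matches anything that is not an ASCII letter
  newString = regex.sub("", string)  # delete every matching character
  return newString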

If I want to make this string lowercase, I have to create a new string (since strings are immutable), like this:
lowerString = newString.lower()
It seems wasteful to create a second string just to lowercase the first one, but if I remove the A-Z from the regex, I lose all the uppercase characters, which I don't want. I just want the final result to be entirely lowercase.

Can this be done without lowerString or, even better, in one line?


Solution

  • newString = regex.sub("", string).lower()
    

    Think of a function call as being replaced by its return value. In the line above, regex.sub is evaluated first, so imagine that call replaced by the string it returns:

    newString = "some String after substitution".lower()
    

    This means that anything you can do with a string, you can also do with the return value of regex.sub. Likewise, you can call further methods on the return value of lower().

    This also means that you can do your whole function in one line!

    newString = re.compile('[^a-zA-Z]').sub("", string).lower()
    

    This might be less readable, though.

    By the way, the standard naming convention in Python (PEP 8) is lowercase words separated by underscores rather than camel case, so newString should be new_string.
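
Putting the pieces of the answer together, the whole function might look like the sketch below. The names gen_letters and NON_LETTERS are hypothetical, chosen only to follow the underscore naming convention, and compiling the pattern once at module level is an assumption about how the function will be used (called repeatedly), not something the question requires.

```python
import re

# Hypothetical module-level constant: compile the pattern once so that
# repeated calls reuse the same compiled object.
NON_LETTERS = re.compile(r"[^a-zA-Z]")


def gen_letters(string):
    """Return only the ASCII letters of `string`, lowercased."""
    # sub() returns a new str, so .lower() can be chained directly onto it.
    return NON_LETTERS.sub("", string).lower()


print(gen_letters("Hello, World! 123"))  # helloworld
```

No intermediate variable is needed because each call in the chain operates on the string returned by the previous one.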