python, python-re

What is the difference between .* and .*? in a regular expression?


I am trying to learn about regular expressions. While investigating the difference between re.match and re.search, I saw a (disputed) claim that re.match('(.*?)word(.*?)',string) was faster than re.search("word",string). I do not see the difference between .*? and .*, nor do I see a need for the trailing (.*?).


Solution

  • See the documentation. That ? makes * non-greedy, i.e., it'll try to match as few repetitions as possible instead of as many as possible.

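    In your example re.match('(.*?)word(.*?)',string), that means the leading group consumes as few characters as possible, so the match stops at the earliest occurrence of word instead of the last. The trailing (.*?) is indeed pointless: with nothing after it in the pattern, a non-greedy group always matches the empty string.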
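
A small sketch of the difference, in case it helps. The sample string and the variable names (s, greedy, lazy) are made up for illustration and are not from the question:

```python
import re

# Hypothetical sample string containing "word" twice.
s = "a word b word c"

# Greedy: the leading (.*) grabs as much as it can, then backtracks
# only far enough for "word" to match, so it lands on the LAST "word".
greedy = re.match(r"(.*)word(.*)", s)

# Non-greedy: the leading (.*?) grows one character at a time,
# so it stops at the FIRST "word".
lazy = re.match(r"(.*?)word(.*?)", s)

print(repr(greedy.group(1)))  # 'a word b '
print(repr(greedy.group(2)))  # ' c'
print(repr(lazy.group(1)))    # 'a '
print(repr(lazy.group(2)))    # ''  (the trailing lazy group matches nothing)

# For comparison, re.search finds the same first occurrence directly.
found = re.search("word", s)
print(found.start())          # 2
```

So the non-greedy match ends right after the first word, and the trailing group contributes nothing to it.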