I'm trying to build a regex to parse our syslogs. I was asked to account for each server that uses the service. I wrote a simple regex to pull out the FQDN, but it seems to be consuming too much of the line...
>>> string = "2010-12-13T00:00:02-05:00 <local3.info> suba1.suba2.example.com named[29959]: client 192.168.11.53#54608: query: subb1.subb2.example.com"
>>> import re
>>> regex = re.compile(r"\s.*?\.example\.com ")
>>> r = regex.search(string)
>>> r
<_sre.SRE_Match object at 0x896dae0bbf9e6bf0>
# Run findall
>>> regex.findall(string)
[u' <local3.info> suba1.suba2.example.com ', u' client 192.168.11.53#54608: query: subb1.subb2.example.com ']
As you can see, findall with .*? is too generic, and the regex ends up consuming too much.
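Replacing \s with \b and the .*? with \S* will do it. \S* cannot cross whitespace, so each match stays inside a single whitespace-delimited token, and \b starts the match at a word boundary instead of capturing the preceding space. Dropping the trailing space from the pattern also lets a name match at the very end of the line.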
>>> regex = re.compile(r'\b\S*\.example\.com')
>>> regex.findall(string)
[u'suba1.suba2.example.com', u'subb1.subb2.example.com']
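Since the goal was to account for each server, here is a minimal sketch of how that pattern might be applied across many log lines to tally names. The helper name count_hosts and the constant FQDN_RE are made up for illustration, and the sample line is the one from the question; adapt it to however you actually read your logs.

```python
import re
from collections import Counter

# Same pattern as in the answer above; FQDN_RE is a hypothetical name.
FQDN_RE = re.compile(r'\b\S*\.example\.com')

def count_hosts(lines):
    """Tally every *.example.com name found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        # findall returns all non-overlapping matches in the line
        counts.update(FQDN_RE.findall(line))
    return counts

# Sample line taken from the question.
sample = ("2010-12-13T00:00:02-05:00 <local3.info> suba1.suba2.example.com "
          "named[29959]: client 192.168.11.53#54608: query: subb1.subb2.example.com")

hosts = count_hosts([sample])
print(sorted(hosts))
```

Note that this counts every matching name on the line, so both the logging host and the queried name are tallied. If only one of those fields matters to you, the pattern would need to be anchored to that field.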