Extracting username in IRC logs with regex?

I'm learning programming as best as I can, and I've been starting with Python. I am currently writing an IRC statistics generator (as if there weren't enough of those already), and I am trying to come up with a regex that matches the username (and only the username) in a particular log format. However, the one I have doesn't match anything with re.search.

Here is an example of the log format:

may 01 14:04:54 <FishCream> Wahoo!
may 01 14:05:01 <LpSamuelm> Oh, if only talking was this fun in real life.
jan 01 00:00:00 <Username>  Message goes here.
jan 01 00:00:00 *   Username Action goes here.

Here are the compile statements:

findusername = re.compile("^[a-zA-Z]+\s[0-9]+\s[0-9:]\s<([A-Za-z]+)>")
finduseraction = re.compile("^[a-zA-Z]+\s[0-9]+\s[0-9:]\s\*\s+([A-Za-z]+)\s")

As you can see, I have made two separate statements for finding the username when the user talks and when they use /me commands. Making one super-regex for these two is probably possible, but I've got enough of a headache as it is.

Can anyone help me identify the problem?

Solution

Your [0-9:] class only matches one character, not the 8 that are there; add a quantifier:

findusername = re.compile(r"^[a-zA-Z]+\s[0-9]+\s[0-9:]{8}\s<([A-Za-z]+)>")
finduseraction = re.compile(r"^[a-zA-Z]+\s[0-9]+\s[0-9:]{8}\s\*\s+([A-Za-z]+)\s")
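If you process the log one line at a time, a minimal sketch of how the two patterns could be used together might look like this. The helper name extract_username and the sample_lines list are made up for illustration; they are not part of your code.

```python
import re

# Same patterns as above, written as raw strings so that \s reaches the
# regex engine untouched.
findusername = re.compile(r"^[a-zA-Z]+\s[0-9]+\s[0-9:]{8}\s<([A-Za-z]+)>")
finduseraction = re.compile(r"^[a-zA-Z]+\s[0-9]+\s[0-9:]{8}\s\*\s+([A-Za-z]+)\s")

def extract_username(line):
    """Return the username on one log line, or None if neither pattern matches.

    Hypothetical helper, named here only for the example.
    """
    match = findusername.search(line) or finduseraction.search(line)
    return match.group(1) if match else None

# Two lines from the question plus one line that should not match.
sample_lines = [
    "may 01 14:04:54 <FishCream> Wahoo!",
    "jan 01 00:00:00 *   Username Action goes here.",
    "not a log line",
]
usernames = [extract_username(line) for line in sample_lines]
print(usernames)  # ['FishCream', 'Username', None]
```

Note that [A-Za-z]+ only accepts letters, so a nickname containing digits or underscores would not match; widen the character class if your logs contain such names.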

This assumes that you match one log line at a time. If your log text contains multiple lines in a single string, add the re.MULTILINE flag so that ^ matches at the start of every line rather than only at the start of the string.

A demo using the re.MULTILINE flag with .findall() on your input example (here logs is a string holding the four sample lines):

>>> findusername = re.compile(r"^[a-zA-Z]+\s[0-9]+\s[0-9:]{8}\s<([A-Za-z]+)>", re.MULTILINE)
>>> finduseraction = re.compile(r"^[a-zA-Z]+\s[0-9]+\s[0-9:]{8}\s\*\s+([A-Za-z]+)\s", re.MULTILINE)
>>> findusername.findall(logs)
['FishCream', 'LpSamuelm', 'Username']
>>> finduseraction.findall(logs)
['Username']
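As for the "super-regex" you mentioned: one possible way to fold both cases into a single pattern is an alternation after the shared timestamp prefix. This is only a sketch; the name combined is made up, and logs is assumed to hold your four sample lines as one string.

```python
import re

# Shared date/time prefix, then either "<name>" or "*   name ".
# Group 1 captures a normal message, group 2 captures a /me action.
combined = re.compile(
    r"^[a-zA-Z]+\s[0-9]+\s[0-9:]{8}\s"
    r"(?:<([A-Za-z]+)>|\*\s+([A-Za-z]+)\s)",
    re.MULTILINE,
)

# The sample log from the question, as a single multi-line string.
logs = """may 01 14:04:54 <FishCream> Wahoo!
may 01 14:05:01 <LpSamuelm> Oh, if only talking was this fun in real life.
jan 01 00:00:00 <Username>  Message goes here.
jan 01 00:00:00 *   Username Action goes here.
"""

# Exactly one of the two groups is set per match, so take whichever is not None.
names = [m.group(1) or m.group(2) for m in combined.finditer(logs)]
print(names)  # ['FishCream', 'LpSamuelm', 'Username', 'Username']
```

With two capturing groups, .findall() would return tuples with one empty string each, which is why the sketch uses .finditer() and picks the populated group. Keeping the two separate patterns is just as valid if you want to count messages and actions separately.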