How do I tokenize the sample string using a regular expression in Python?

I am new to regular expressions. Besides the pattern that matches the following string, please also point out references and/or sample websites.

The data string

1.  First1 Last1 - 20 (Long Description) 
2.  First2 Last2 - 40 (Another Description)

I want to be able to extract tuples {First1,Last1,20} and {First2,Last2,40} from the above string.

Solution

This one seems OK: http://docs.python.org/howto/regex.html#regex-howto Just skim it and try some examples. Regexps are a little tricky (basically a small programming language) and take some time to learn, but they are very useful to know. Just experiment and take one step at a time.

(yes, I could just give you the answer, but fish, man, teach)

...

As requested, a solution that doesn't use split(): iterate over the lines, and check each line like this:

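import re

# Raw string, so the backslashes reach the regex engine unchanged.
p = re.compile(r'\d+\.\s+(\w+)\s+(\w+)\s+-\s+(\d+)')
m = p.match(the_line)  # None if the line does not conform
# m.group(1) will be the first word
# m.group(2) the second word
# m.group(3) will be the first number after the last word
# (m.group(0) is the whole matched text, not the first group)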
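
To get the tuples asked for in the question, the per-line check can be wrapped in a loop. This is only a minimal sketch: the helper name `extract_records` and the sample text are made up for illustration, and converting the number with int() is an assumption about what is wanted. Note that Python numbers capturing groups from 1; group(0) is the whole match.

```python
import re

LINE_RE = re.compile(r'\d+\.\s+(\w+)\s+(\w+)\s+-\s+(\d+)')

def extract_records(text):
    # Hypothetical helper: one (first, last, number) tuple per matching line.
    records = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not fit the pattern
        first, last, number = m.groups()
        records.append((first, last, int(number)))  # int() is an assumption
    return records

sample = (
    "1.  First1 Last1 - 20 (Long Description) \n"
    "2.  First2 Last2 - 40 (Another Description)\n"
)
print(extract_records(sample))
# [('First1', 'Last1', 20), ('First2', 'Last2', 40)]
```

match() anchors at the start of the string, so a line with leading whitespace would be skipped here; strip the line first if that is a concern.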

The regexp is: <some digits><a dot>
<some whitespace><alphanumeric characters, captured as group 1>
<some whitespace><alphanumeric characters, captured as group 2>
<some whitespace><a '-'><some whitespace><digits, captured as group 3>

It's a little strict, but that way you'll catch non-conforming lines: match() returns None for them.
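
As a sketch of how that strictness plays out, here is the same pattern written with re.VERBOSE so each piece carries its own comment, tried on one conforming line and one that is not. The second input line is invented for illustration; groups are numbered from 1, as Python's re module does.

```python
import re

# The same pattern, spelled out piece by piece (whitespace in the pattern is ignored).
LINE_RE = re.compile(r"""
    \d+ \.       # some digits, then a literal dot
    \s+ (\w+)    # whitespace, then word characters (group 1)
    \s+ (\w+)    # whitespace, then word characters (group 2)
    \s+ - \s+    # a '-' with whitespace on both sides
    (\d+)        # digits (group 3)
""", re.VERBOSE)

good = LINE_RE.match("1.  First1 Last1 - 20 (Long Description)")
bad = LINE_RE.match("3.  OnlyOneName - 60 (Missing last name)")  # made-up bad line

print(good.groups())  # ('First1', 'Last1', '20')
print(bad)            # None
```

Checking for None before calling group() is what lets you report or skip the lines that do not fit.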