I am new to regular expressions. Besides the pattern to match the following string, please also point me to references and/or sample websites.
The data string
1. First1 Last1 - 20 (Long Description)
2. First2 Last2 - 40 (Another Description)
I want to be able to extract tuples {First1,Last1,20} and {First2,Last2,40} from the above string.
This one seems OK: http://docs.python.org/howto/regex.html#regex-howto Just skim it and try some examples. Regexps are a little tricky (basically a small programming language) and take some time to learn, but they are very useful to know. Just experiment and take one step at a time.
(yes, I could just give you the answer, but fish, man, teach)
As requested, a solution that doesn't use split(): iterate over the lines, and check each line:
import re
p = re.compile(r'\d+\.\s+(\w+)\s+(\w+)\s+-\s+(\d+)')
m = p.match(the_line)
# m is None if the line does not match; otherwise:
# m.group(1) will be the first word
# m.group(2) the second word
# m.group(3) will be the first number after the last word
# (m.group(0) is the whole matched text, not a capture group)
The regexp is: <some digits><a dot>
<some whitespace><alphanumeric characters, captured as group 1>
<some whitespace><alphanumeric characters, captured as group 2>
<some whitespace><a '-'><some whitespace><digits, captured as group 3>
It's a little strict, but that way you'll catch non-conforming lines: match() returns None for any line that doesn't fit the pattern.
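To put it all together, here is a minimal sketch that runs the pattern over the two sample lines from the question. The names LINE_PATTERN and extract_tuples are just made up for this example, and converting the number to an int is my assumption about what you want; drop the int() call if you'd rather keep it as a string.

```python
import re

# Same pattern as described above; the raw string (r'...') keeps the
# backslashes intact for the regex engine.
LINE_PATTERN = re.compile(r'\d+\.\s+(\w+)\s+(\w+)\s+-\s+(\d+)')

def extract_tuples(text):
    """Return a (first, last, number) tuple for every line that matches."""
    results = []
    for line in text.splitlines():
        m = LINE_PATTERN.match(line.strip())
        if m is None:
            continue  # non-conforming line: skip it (or log it, raise, ...)
        first, last, number = m.groups()  # capture groups 1, 2 and 3
        results.append((first, last, int(number)))
    return results

data = """1. First1 Last1 - 20 (Long Description)
2. First2 Last2 - 40 (Another Description)"""

print(extract_tuples(data))
# [('First1', 'Last1', 20), ('First2', 'Last2', 40)]
```

Lines that don't fit the pattern are silently skipped here; if you'd rather know about them, replace the continue with whatever error handling suits you.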