Working with the characters backslash and curly brackets in regular expressions in Python

I am using regular expressions to recognize the lines containing \begin{frame} in .tex files. Below is my code:

#!/usr/bin/python

import re,sys

def isEven(num):
    res = [False,True][bool(num % 2 == 0)]
    return res

textin = open(sys.argv[1]).readlines()
nline = 0
pat = r'\b\begin{frame}\b'
for line in textin:
    line = line.strip(' ')
    #print 'Test: ',line[:13]
    if re.match(pat,line):
        print 'here'
        nline += 1
    if isEven(nline):
        print '%',line.strip('\n')
    else:
        print line.strip('\n')

This program aims to add the character '%' before the lines in the tex file if the number of frames is even. In other words, I want to comment out the slides whose slide number is even.

Do you know what is wrong with the pattern?

Solution

Look at your pattern string again:

r'\b\begin{frame}\b'

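Notice it starts with '\b\b'. You mean the first one as a word boundary and the second one as part of what you want to match -- but how could re possibly guess which meaning you intend for each?! It reads both as word boundaries, so the only text left to match literally is 'egin{frame}'.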
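As a quick check of how re reads the original pattern, here is a minimal sketch. The sample strings are made up for illustration, and the snippet avoids print so that it should run on either Python 2 or 3:

```python
import re

broken = r'\b\begin{frame}\b'   # the pattern from the question
target = r'\begin{frame}'       # the literal LaTeX text (raw string: one backslash)

# Both \b are word boundaries, so the real LaTeX text is not found...
misses_target = re.search(broken, target) is None

# ...while a string that merely contains 'egin{frame}' between boundaries is.
hits_other = re.search(broken, 'egin{frame}x') is not None
```

Both misses_target and hits_other come out true, which is the opposite of what the pattern was meant to do.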

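I don't think you need the word boundaries, by the way -- in fact they would break the matching here: a backslash is not a word character, so a \b just before it only succeeds when a letter, digit or underscore immediately precedes the backslash, which is not the case at the start of a line or after a space. Moreover, re.match only matches at the start of the string; since you say "contain", as opposed to "start with", in your Q's text, you may actually want re.search.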
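To make the match/search difference concrete, here is a small sketch. It uses the escaped form of the pattern, and the indented sample line is a hypothetical one, not taken from your file:

```python
import re

pat = r'\\begin\{frame\}'                 # escaped pattern for the literal text \begin{frame}
indented = '    \\begin{frame}[fragile]'  # hypothetical line with leading spaces

at_start = re.match(pat, indented)        # None: match only tries position 0
anywhere = re.search(pat, indented)       # finds the frame after the spaces

# With a leading \b the pattern fails even on a line that begins with the
# backslash, because there is no word character on either side of position 0.
bounded = re.search(r'\b' + pat, '\\begin{frame}')
```

Here at_start and bounded are None, while anywhere is a match object starting at index 4.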

To match a backslash, you need to double it in the pattern. And you can use a single backslash to escape punctuation, such as those braces.
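If you would rather not escape by hand, the standard library's re.escape can build the pattern from the literal text. This is only an alternative sketch, not something you need; the sample line is hypothetical:

```python
import re

literal = r'\begin{frame}'        # the exact text to find
pat = re.escape(literal)          # escapes the backslash and the braces

line = r'  \begin{frame}{Title}'  # hypothetical sample line
found = re.search(pat, line) is not None
```

The escaped pattern only finds the literal text, so a line such as \end{frame} is not matched.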

So I would recommend...:

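import re, sys

def isEven(n): return n%2 == 0

nline = 0  # number of frames seen so far
# \\ matches a literal backslash; \{ and \} match literal braces
pat = r'\\begin\{frame\}'
with open(sys.argv[1]) as textin:
    for line in textin:
        line = line.strip()
        # search, not match: the frame may start anywhere in the line
        if re.search(pat,line):
            print 'here'
            nline += 1
        if isEven(nline):
            print '%', line
        else:
            print line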

I've made a few more improvements, but they're not directly relevant to your Q (e.g., using with to open the file and looping over it line by line; stripping each line of whitespace completely, once, rather than by instalments; etc.) -- you don't have to use any of these :-).
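If you want something you can test without a file, the same idea can be sketched as a function over a list of lines. The name comment_even_frames is hypothetical, and two choices here are my assumptions rather than anything in your code: lines before the first frame are left alone (the count starts at 0, which is even, so otherwise the preamble would be commented out too), and indentation is kept by stripping only the trailing newline.

```python
import re

FRAME_PAT = re.compile(r'\\begin\{frame\}')

def comment_even_frames(lines):
    """Return the lines, prefixed with '% ' while the frame count is even."""
    out = []
    nframes = 0
    for line in lines:
        line = line.rstrip('\n')          # drop only the newline, keep indentation
        if FRAME_PAT.search(line):
            nframes += 1
        # Assumption: nframes > 0 keeps everything before the first frame as-is.
        if nframes > 0 and nframes % 2 == 0:
            out.append('% ' + line)
        else:
            out.append(line)
    return out
```

You would call it as comment_even_frames(open(sys.argv[1]).readlines()) and write out the result joined with newlines.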