TextGrid is the "segmentation" file format used by the Praat program. I'd like to write a parser for it so that I can then verify the data. My question is:
How would you write a parser for this format? Would you read it line by line, or take some other approach? Is this a known format?
File type = "ooTextFile"
Object class = "TextGrid"
xmin = 0
xmax = 93.0538775510204
tiers? <exists>
size = 3
item []:
item [1]:
class = "IntervalTier"
name = "diph"
xmin = 0
xmax = 93.0538775510204
intervals: size = 65
intervals [1]:
xmin = 0
xmax = 1.300090702947846
text = ""
intervals [2]:
xmin = 1.300090702947846
xmax = 1.5300845864661654
text = "ey_s"
intervals [3]:
xmin = 1.5300845864661654
xmax = 3.4648692624493815
text = ""
(This pattern then repeats to EOF, with intervals [4] through [n].)
A TextGrid parser already exists as part of the NLTK toolkit, in its nltk_contrib package. The Python file is here:
http://nltk.googlecode.com/svn/trunk/nltk_contrib/nltk_contrib/textgrid.py
Updated link: https://github.com/nltk/nltk_contrib/blob/master/nltk_contrib/textgrid.py
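If you would rather not pull in nltk_contrib, reading the file line by line is workable for the layout shown in the question. Below is a minimal sketch, not the NLTK implementation. It assumes the long text format with one `key = value` pair per line, only IntervalTier tiers, and no text values that span several lines. The names `parse_textgrid`, `check_intervals` and `convert` are made up for this example.

```python
import re

# `key = value` lines such as `xmin = 0` or `text = "ey_s"`.
KEY_VALUE = re.compile(r'^(\w+) = (.*)$')
# `item [1]:` starts a tier; the bare `item []:` line does not match.
TIER_HEADER = re.compile(r'^item \[\d+\]:$')
# `intervals [3]:` starts an interval; `intervals: size = 65` does not match.
INTERVAL_HEADER = re.compile(r'^intervals \[\d+\]:$')


def convert(value):
    """Strip the quotes from quoted values, turn the rest into floats."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    try:
        return float(value)
    except ValueError:
        return value  # assumption: keep anything unexpected as raw text


def parse_textgrid(text):
    """Return a list of tier dicts, each holding an 'intervals' list."""
    tiers = []
    current = None  # the dict that receives the next key/value pairs
    for raw in text.splitlines():
        line = raw.strip()
        if TIER_HEADER.match(line):
            current = {"intervals": []}
            tiers.append(current)
        elif INTERVAL_HEADER.match(line) and tiers:
            current = {}
            tiers[-1]["intervals"].append(current)
        elif current is not None:
            m = KEY_VALUE.match(line)
            if m:
                current[m.group(1)] = convert(m.group(2))
    return tiers


def check_intervals(tier):
    """Return a list of problems found in one tier (empty if none)."""
    problems = []
    previous_end = tier.get("xmin")
    for i, iv in enumerate(tier["intervals"], start=1):
        if iv["xmin"] > iv["xmax"]:
            problems.append(f"interval {i}: xmin > xmax")
        if previous_end is not None and iv["xmin"] != previous_end:
            problems.append(f"interval {i}: gap or overlap at {iv['xmin']}")
        previous_end = iv["xmax"]
    return problems
```

To use it, read the whole file into a string, call `parse_textgrid` on it, and run `check_intervals` on each returned tier. The file-level header lines (`File type`, `Object class`, the global `xmin`/`xmax`, `size`) are skipped here because they appear before the first `item [n]:` line. The check compares boundaries with exact equality, which works when adjacent intervals repeat the same number text as in the sample; a tolerance may be more appropriate for other files.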