parsing, pos-tagger, syntaxnet

Understanding Annotations - SyntaxNet


I built and ran SyntaxNet successfully on a set of 1400 tweets, but I have difficulty understanding what each column in the parsed file means. For example, I have the sentence:

Shoutout @Aetna for covering my doctor visit. Love you!

for which the parsed file contents are:

1       Shoutout        _       NOUN    NNP     _       9       nsubj   _       _
2       @       _       ADP     IN      _       1       prep    _       _
3       Aetna   _       NOUN    NNP     _       2       pobj    _       _
4       for     _       ADP     IN      _       1       prep    _       _
5       covering        _       VERB    VBG     _       4       pcomp   _       _
6       my      _       PRON    PRP$    _       8       poss    _       _
7       doctor  _       NOUN    NN      _       8       nn      _       _
8       visit.  _       NOUN    NN      _       5       dobj    _       _
9       Love    _       VERB    VBP     _       0       ROOT    _       _
10      you     _       PRON    PRP     _       9       dobj    _       _
11      !       _       .       .       _       9       punct   _       _

What exactly does each column mean? Why are there blanks, and what are the numbers that appear alongside the POS tags?


Solution

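  • This format is called the CoNLL format, and it exists in several versions. The ten columns in the output above appear to follow the CoNLL-X layout: ID (the token's index within the sentence), FORM (the word itself), LEMMA, CPOSTAG (coarse-grained POS tag), POSTAG (fine-grained POS tag), FEATS (morphological features), HEAD (the ID of the token's syntactic head, or 0 for the root), DEPREL (the dependency relation to that head), PHEAD, and PDEPREL. An underscore marks a field that the parser did not fill in, which is why the blanks appear. The numbers in the seventh column are the HEAD references: for example, "you" (token 10) has head 9, meaning it is the dobj of "Love".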
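
To make the column layout concrete, here is a minimal Python sketch that reads a few of the lines from the question into named fields and resolves each HEAD number to the word it points at. The `Token` class, the `parse_conll_line` and `describe` helpers, and the field names are illustrative assumptions based on the CoNLL-X column order, not part of SyntaxNet itself. It also assumes that no token contains whitespace, so splitting on any run of whitespace is safe.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """One row of the parsed file; field names assume the CoNLL-X column order."""
    id: int        # token index within the sentence, starting at 1
    form: str      # the word as it appears in the text
    lemma: str     # "_" when not filled in
    cpostag: str   # coarse-grained POS tag, e.g. VERB
    postag: str    # fine-grained POS tag, e.g. VBP
    feats: str     # morphological features, "_" when not filled in
    head: int      # id of the syntactic head, 0 for the root
    deprel: str    # dependency relation to the head, e.g. dobj
    phead: str     # projective head, "_" when not filled in
    pdeprel: str   # relation to the projective head, "_" when not filled in


def parse_conll_line(line: str) -> Token:
    """Split one row into its ten columns (assumes tokens contain no spaces)."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}: {line!r}")
    return Token(int(fields[0]), fields[1], fields[2], fields[3], fields[4],
                 fields[5], int(fields[6]), fields[7], fields[8], fields[9])


def describe(tokens: List[Token]) -> List[str]:
    """Render each token as 'word --relation--> head word'."""
    by_id = {t.id: t for t in tokens}
    lines = []
    for t in tokens:
        head_form = "(root)" if t.head == 0 else by_id[t.head].form
        lines.append(f"{t.form} --{t.deprel}--> {head_form}")
    return lines


# The last three rows of the output shown in the question.
SAMPLE = """\
9       Love    _       VERB    VBP     _       0       ROOT    _       _
10      you     _       PRON    PRP     _       9       dobj    _       _
11      !       _       .       .       _       9       punct   _       _
"""

tokens = [parse_conll_line(line) for line in SAMPLE.splitlines() if line.strip()]
for text in describe(tokens):
    print(text)
```

Run on those three rows, the sketch shows "you" and "!" both attaching to "Love", which is itself the root of the sentence (HEAD 0).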