nlp, spacy, part-of-speech

What are some examples of the part-of-speech tag "list-item-marker"?


What are some example sentences that include a word that would be tagged as LS (list item marker)?

This tag is used in spaCy and seems to come from the Penn Treebank tag set (UPenn):

UPENN: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

spaCy: https://github.com/explosion/spaCy/blob/master/spacy/glossary.py#L68


Solution

  • The place I've seen LS assigned is on the numbers that introduce the items of an enumerated list, where each item ends with a period. Here's an example:

    import spacy

    # Load the small English pipeline (install it with: python -m spacy download en_core_web_sm)
    nlp = spacy.load('en_core_web_sm')
    doc = nlp('''The system shall:
    1) Reprogram software.
    2) Reprogram data.
    3) Read/write to memory.
    4) Lock/unlock flash memory.
    5) Clear memory.
    6) Range check.''')
    print('\n....')

    print("Token Attributes: \n", "token.text, token.pos_, token.tag_, token.dep_, token.orth_")
    for token in doc:
        # Print the text, coarse POS, fine-grained tag, dependency label, and verbatim text
        print("{:<12}{:<12}{:<12}{:<12}{:<12}".format(token.text, token.pos_, token.tag_, token.dep_, token.orth_))
    

    Output (the exact tags can vary with the spaCy and model version):

    ....
    Token Attributes: 
     token.text, token.pos_, token.tag_, token.dep_, token.orth_
    The         DET         DT          det         The         
    system      NOUN        NN          nsubj       system      
    shall       VERB        MD          ROOT        shall       
    :           PUNCT       :           punct       :           
    
               SPACE       _SP                     
    
    1           PUNCT       LS          dep         1           
    )           PUNCT       -RRB-       punct       )           
    Reprogram   PROPN       NNP         compound    Reprogram   
    software    NOUN        NN          dobj        software    
    .           PUNCT       .           punct       .           
    
               SPACE       _SP                     
    
    2           PUNCT       LS          dobj        2           
    )           PUNCT       -RRB-       punct       )           
    Reprogram   PROPN       NNP         compound    Reprogram   
    data        NOUN        NNS         dobj        data        
    .           PUNCT       .           punct       .           
    
               SPACE       _SP                     
    
    3           PUNCT       LS          ROOT        3           
    )           PUNCT       -RRB-       punct       )           
    Read        VERB        VB          dep         Read        
    /           SYM         SYM         punct       /           
    write       VERB        VBP         ROOT        write       
    to          ADP         IN          prep        to          
    memory      NOUN        NN          pobj        memory      
    .           PUNCT       .           punct       .           
    
               SPACE       _SP                     
    
    4           PUNCT       LS          ROOT        4           
    )           PUNCT       -RRB-       punct       )           
    Lock        PROPN       NNP         npadvmod    Lock        
    /           SYM         SYM         punct       /           
    unlock      ADJ         JJ          compound    unlock      
    flash       NOUN        NN          compound    flash       
    memory      NOUN        NN          ROOT        memory      
    .           PUNCT       .           punct       .           
    
               SPACE       _SP                     
    
    5           PUNCT       LS          nummod      5           
    )           PUNCT       -RRB-       punct       )           
    Clear       ADJ         JJ          amod        Clear       
    memory      NOUN        NN          ROOT        memory      
    .           PUNCT       .           punct       .           
    
               SPACE       _SP                     
    
    6           PUNCT       LS          nummod      6           
    )           PUNCT       -RRB-       punct       )           
    Range       NOUN        NN          compound    Range       
    check       NOUN        NN          ROOT        check       
    .           PUNCT       .           punct       .           
    ....
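
To pull out only the list item markers instead of scanning the whole table, one option is to filter on token.tag_. This is a minimal sketch, not part of the original answer: list_item_markers is a hypothetical helper name, and the stand-in tokens below only mimic the text and tag_ attributes of spaCy tokens (values copied from the output above) so that the snippet runs without a model installed.

```python
from types import SimpleNamespace


def list_item_markers(tokens):
    """Return the text of every token whose fine-grained tag is LS.

    Hypothetical helper: works on any iterable of objects that expose
    .text and .tag_, which includes a spaCy Doc.
    """
    return [tok.text for tok in tokens if tok.tag_ == "LS"]


# Stand-in tokens (an assumption for illustration) that copy the text/tag
# pairs from the first two list items in the output above.
sample = [
    SimpleNamespace(text="1", tag_="LS"),
    SimpleNamespace(text=")", tag_="-RRB-"),
    SimpleNamespace(text="Reprogram", tag_="NNP"),
    SimpleNamespace(text="software", tag_="NN"),
    SimpleNamespace(text=".", tag_="."),
    SimpleNamespace(text="2", tag_="LS"),
    SimpleNamespace(text=")", tag_="-RRB-"),
]

markers = list_item_markers(sample)
print(markers)  # ['1', '2']
```

With spaCy loaded as in the example above, the same helper should accept the parsed document directly, as in list_item_markers(doc); which tokens it returns then depends on what the model predicts.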