python, nltk, tweets

How to label sentiment polarity in a big text file?


I need some help with this, and sorry if it sounds stupid. I am new to Python and want to try this example....

But the labeling there was done manually, which is hard work if I have two .txt files (pos and neg), each with 1000 tweets.

How can I adapt the example above to work with text files?


Solution

  • If I understood correctly, you need a way of reading a text file into a Python object.

    Assuming you have two text files containing positive and negative samples (pos.txt and neg.txt), with one tweet per line:

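    train_samples = {}

    # file() exists only in Python 2; open() works in both Python 2 and 3
    with open('pos.txt', 'rt') as f:
        for line in f:
            # Strip the trailing newline so the key is just the tweet text
            train_samples[line.strip()] = 'pos'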
    

    Repeat the above loop for neg.txt with the label 'neg', and train_samples is fully populated.
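
To avoid writing the loop twice, both files can go through one small helper. This is only a minimal sketch: the function names label_lines and load_train_samples are made up for illustration, and it assumes the files are UTF-8 encoded with one tweet per line.

```python
def label_lines(path, label, samples=None):
    """Map every non-empty line of the file at `path` to `label`.

    Hypothetical helper: assumes UTF-8 text with one tweet per line.
    """
    if samples is None:
        samples = {}
    with open(path, 'rt', encoding='utf-8') as f:
        for line in f:
            tweet = line.strip()  # drop the trailing newline and padding
            if tweet:             # skip blank lines
                samples[tweet] = label
    return samples


def load_train_samples(pos_path='pos.txt', neg_path='neg.txt'):
    """Build one {tweet: label} dict from the positive and negative files."""
    samples = label_lines(pos_path, 'pos')
    label_lines(neg_path, 'neg', samples)  # adds to the same dict
    return samples


# Usage, once pos.txt and neg.txt exist in the working directory:
# train_samples = load_train_samples()
```

Because the tweets are dictionary keys, identical tweets collapse into a single entry, and a tweet that appears in both files keeps the label of the file read last ('neg' here). If duplicates matter, a list of (tweet, label) tuples may be a better fit.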