Tags: python, nltk, tokenize

Tokenizing sentences from a txt file, and getting the "expected string or bytes-like object" error


I thought I had really straightforward code for opening a file, reading it, and tokenizing it into sentences.

import nltk
text = open('1865-Lincoln.txt', 'r')
tokens = nltk.sent_tokenize(text)
print(tokens)

But I keep getting a crazy long error that ends with

TypeError: expected string or bytes-like object

Solution

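  • You need a read() call between open() and the tokenizer. open() returns a file object, not a string, and nltk.sent_tokenize expects a string, which is what the TypeError is complaining about.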

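    fileObj = open('1865-Lincoln.txt', 'r')
    text = fileObj.read()   # read() returns the file's contents as one string
    fileObj.close()         # release the file handle once the text is in memory
    tokens = nltk.sent_tokenize(text)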
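
The same fix can also be written with a with block, so the file is closed automatically. This is only a minimal sketch: the helper names read_text and sentences_from_file are made up for illustration, and the utf-8 encoding is an assumption about how the file was saved. The tokenizer is passed in as an argument, so any function that takes a string and returns a list will do.

```python
def read_text(path, encoding="utf-8"):
    """Return the whole file as a single str.

    Hypothetical helper; the utf-8 default is an assumption about the file.
    """
    # The with block closes the file even if read() raises.
    with open(path, "r", encoding=encoding) as file_obj:
        return file_obj.read()


def sentences_from_file(path, tokenizer):
    """Read the file, then hand its text (a str) to the tokenizer.

    Hypothetical helper; tokenizer is any callable str -> list,
    for example nltk.sent_tokenize.
    """
    text = read_text(path)
    return tokenizer(text)
```

With NLTK installed and its sentence tokenizer data downloaded, the call for the file in the question would be sentences_from_file('1865-Lincoln.txt', nltk.sent_tokenize).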