How to use a text file as input to NLTK’s tokenize.regexp in Python

Basically, I have a text file that I want to use as input to NLTK’s tokenize.regexp. How can I feed the text file into the code below?

from nltk.tokenize import RegexpTokenizer

tokenizer = RegexpTokenizer(r'\w+')

raw = doc_a.lower()  # instead of 'doc_a' I want my text file as input

tokens = tokenizer.tokenize(raw)

Solution

Before this line:

raw = doc_a.lower()  # instead of 'doc_a' I want my text file as input

add code to read doc_a from your file, like this:

with open(r'path_to\my_text_file.txt', 'r') as f:  # 'f' avoids shadowing the built-in input()
    doc_a = f.read()  # read the whole file into one string

Then continue with lowercasing and tokenizing as before.
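
Putting the pieces together, here is a minimal end-to-end sketch. The helper name tokenize_file, the utf-8 encoding, and the temporary demo file are assumptions made for illustration and are not part of the original question; swap in your own path and encoding. The re.findall fallback is only there so the sketch still runs where NLTK is not installed; for the simple \w+ pattern it should give the same tokens.

```python
import os
import re
import tempfile

try:
    from nltk.tokenize import RegexpTokenizer
    _tokenize = RegexpTokenizer(r'\w+').tokenize
except ImportError:
    # Fallback (assumption, for illustration only): lets the sketch run
    # without NLTK; for this simple pattern re.findall yields the same tokens.
    _tokenize = re.compile(r'\w+').findall


def tokenize_file(path, encoding='utf-8'):
    """Read a text file, lowercase it, and split it into word tokens."""
    # encoding='utf-8' is an assumption; use whatever your file is saved in.
    with open(path, 'r', encoding=encoding) as f:
        raw = f.read().lower()
    return _tokenize(raw)


# Demo on a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write("Hello, World!\nNLTK reads text files.")
    demo_path = tmp.name

tokens = tokenize_file(demo_path)
os.remove(demo_path)
print(tokens)  # ['hello', 'world', 'nltk', 'reads', 'text', 'files']
```

In your own script you would just call tokenize_file(r'path_to\my_text_file.txt') and skip the temporary-file part.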