How can I tokenize the lines of my CSV into one list rather than a separate list per line?
from nltk.tokenize import sent_tokenize

with open('train.csv') as file_object:
    for trainline in file_object:
        tokens_train = sent_tokenize(trainline)
        print(tokens_train)
This is the output I am getting:
['2.1 Separated of trains']
['Principle: The method to make the signal is different.']
['2.2 Context']
I want all of them in one list:
['2.1 Separated of trains', 'Principle: The method to make the signal is different.', '2.2 Context']
Since sent_tokenize() returns a list, you can simply extend a starting list each time:
from nltk.tokenize import sent_tokenize

alltokens = []
with open('train.csv') as file_object:
    for trainline in file_object:
        tokens_train = sent_tokenize(trainline)
        alltokens.extend(tokens_train)
print(alltokens)
Or with a list comprehension:
with open('train.csv') as file_object:
    alltokens = [token for trainline in file_object
                 for token in sent_tokenize(trainline)]
print(alltokens)
Both solutions work even when sent_tokenize() returns a list with more than one sentence for a line.
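To see why extend() (rather than append()) is the right call here, here is a minimal, self-contained sketch. It uses a hypothetical fake_sent_tokenize() stand-in for NLTK's sent_tokenize, so it runs without NLTK installed; the flattening behavior it demonstrates is the same:

```python
def fake_sent_tokenize(line):
    # Stand-in for nltk.tokenize.sent_tokenize: naive split on '. '
    return [s.strip() for s in line.split('. ') if s.strip()]

lines = ["2.1 Separated of trains",
         "First sentence. Second sentence."]

flat = []
nested = []
for line in lines:
    sentences = fake_sent_tokenize(line)
    flat.extend(sentences)    # adds each sentence as its own element
    nested.append(sentences)  # adds the whole list as a single element

print(flat)    # ['2.1 Separated of trains', 'First sentence', 'Second sentence.']
print(nested)  # [['2.1 Separated of trains'], ['First sentence', 'Second sentence.']]
```

With append() you would get back exactly the list-of-lists output shown in the question; extend() unpacks each per-line list into the accumulator, giving the single flat list you want.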