I have some sentences in a list like this:
some_list = ['Joe is travelling via train.',
             'Joe waited for the train, but the train was late.',
             "Even after an hour, there was no sign of the train. Joe then went to talk to station master about the train's situation."]
Then I used NLTK's sentence tokenizer, because I want to analyse each sentence within an entry individually. The output is now a list of lists, something like this:
sent_tokenize_list = [['Joe is travelling via train.'],
['Joe waited for the train,',
'but the train was late.'],
['Even after an hour,',
'there was no sign of the train.',
"Joe then went to talk to station master about the train's situation."]]
Now, from this list of lists, how can I select only the lists that contain more than one sentence (i.e. the 2nd and 3rd lists in my example) and get each of them as a separate, plain list?
i.e. the output should be:
['Joe waited for the train,', 'but the train was late.']
['Even after an hour,', 'there was no sign of the train.',
"Joe then went to talk to station master about the train's situation."]
You can use len() to check the number of sentences in each inner list and keep only the lists that have at least two.
Ex:
sent_tokenize_list = [['Joe is travelling via train.'],
['Joe waited for the train,',
'but the train was late.'],
['Even after an hour,','there was no sign of the train.',"Joe then went to talk to station master about the train's situation."]]
print([i for i in sent_tokenize_list if len(i) >= 2])
Output:
[['Joe waited for the train,', 'but the train was late.'], ['Even after an hour,', 'there was no sign of the train.', "Joe then went to talk to station master about the train's situation."]]
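The comprehension above still returns one outer list. If each matching list is wanted on its own, as in the question's expected output, one option is to loop over the filtered result or unpack it. This is a minimal sketch: the helper name multi_sentence_lists and its min_sentences parameter are made up for illustration, and the unpacking line assumes exactly two inner lists match.

```python
def multi_sentence_lists(tokenized, min_sentences=2):
    """Return only the inner lists with at least `min_sentences` items."""
    return [sentences for sentences in tokenized
            if len(sentences) >= min_sentences]


sent_tokenize_list = [
    ['Joe is travelling via train.'],
    ['Joe waited for the train,', 'but the train was late.'],
    ['Even after an hour,', 'there was no sign of the train.',
     "Joe then went to talk to station master about the train's situation."],
]

selected = multi_sentence_lists(sent_tokenize_list)

# Print each matching inner list on its own line, as a separate list.
for sentences in selected:
    print(sentences)

# Unpack into individual names; this only works when the number of
# matching lists is known in advance (two in this example).
second, third = selected
```

Looping is the safer choice when the number of matching lists is not known in advance, because unpacking raises a ValueError if the counts differ.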