python, arrays, tokenize

Splitting a list of strings into separate single-element lists in Python


This is my code.

from nltk.tokenize import sent_tokenize  # assuming NLTK's sent_tokenize

SENTENCE = "He sad might have lung cancer. It’s just a rumor."
sent = sent_tokenize(SENTENCE)

The output is

['He sad might have lung cancer.', 'It’s just a rumor.']

I want to split this list into separate lists, like this:

['He sad might have lung cancer.']
['It’s just a rumor.']

Is there any way of doing this, and if so, how?


Solution

  • Since you want to split by sentence, you can simply do this:

    sentence_list = SENTENCE.split('.')
    for sentence in sentence_list:
        if sentence.strip():  # skip the empty string left after the final '.'
            single_sentence = [sentence.strip() + '.']
            print(single_sentence)
    

    If you actually want all lists containing a single sentence in the same data structure, you'd have to use a list of lists or a dictionary:

    my_sentences = []

    sentence_list = SENTENCE.split('.')
    for sentence in sentence_list:
        if sentence.strip():  # skip the empty string left after the final '.'
            my_sentences.append([sentence.strip() + '.'])
    

    To shorten this with a list comprehension:

    my_sentences = [[s.strip() + '.'] for s in SENTENCE.split('.') if s.strip()]

    SENTENCE.split('.') is evaluated only once here, so the comprehension is not slower than the explicit loop. The real caveat is that splitting on '.' also breaks on abbreviations and decimal numbers, which a sentence tokenizer usually handles better.
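
Since `sent` in the question is already a list of sentences, another option is to skip the manual split and wrap each element in its own list. The sketch below assumes the tokenizer output shown in the question and hard-codes it so the snippet runs without any extra library. The names `separate` and `by_index` are made up for illustration.

```python
# Tokenizer output copied from the question (hard-coded for this sketch)
sent = ['He sad might have lung cancer.', 'It’s just a rumor.']

# List of lists: each sentence wrapped in its own one-element list
separate = [[s] for s in sent]

# Dictionary variant: one-element lists keyed by sentence position
by_index = {i: [s] for i, s in enumerate(sent)}

for item in separate:
    print(item)
```

This keeps whatever sentence boundaries the tokenizer found, so the punctuation does not have to be re-attached by hand.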