python, text, words

Get phrases with 3 words


I have tried to figure this one out for some time now.

I want to take a large text/string and split it into phrases of 3 words, and add them to an array.

I have tried using split(), but it doesn't work as I hoped.

Here is what I was thinking of doing to get it to work:

Start with the first 3 words in the string and put them in an array, then move forward 1 word and take the next 3 words, and so on.

Is this a bad way of doing this?

Kind regards :)


Solution

  • my_really_long_string = "this is a really long string"
    split_string = my_really_long_string.split()
    phrase_array = [" ".join(split_string[i:i+3]) for i in range(len(split_string) - 2)]
    

    The first line just represents your string.

    After that, just split on whitespace (split() with no arguments does this), assuming that's all you care about for defining the end of words. (@andrew_reece's comments about edge cases are highly relevant.)

    The next line iterates over the indices 0 through n-3, where n is the number of words in split_string. For each index it takes 3 consecutive words from the split_string list and joins them back together with spaces.

    This is almost certainly not the fastest way to do things, since it has a split and a join, but it is very straightforward.

    >>> my_really_long_string = "this is a really long string"
    >>> split_string = my_really_long_string.split()
    >>> phrases = [" ".join(split_string[i:i+3]) for i in range(len(split_string) - 2)]
    >>> phrases
    ['this is a', 'is a really', 'a really long', 'really long string']
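
If you want to reuse this for other phrase lengths, or guard against the edge cases mentioned above, something like the following might work. It is only a sketch: the function name `word_phrases` and the `strip_punctuation` flag are made up for illustration, and the regex is one assumed definition of a "word" (letters, digits, underscores and apostrophes), which may not match what you need.

```python
import re


def word_phrases(text, size=3, strip_punctuation=False):
    """Return every run of `size` consecutive words in `text`, moving one word at a time."""
    if size < 1:
        raise ValueError("size must be at least 1")
    if strip_punctuation:
        # Assumption: a word is a run of letters, digits, underscores or apostrophes.
        words = re.findall(r"[\w']+", text)
    else:
        # Same behaviour as the answer above: split on any whitespace.
        words = text.split()
    # The range is empty when there are fewer than `size` words,
    # so short inputs give an empty list instead of a partial phrase.
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


print(word_phrases("this is a really long string"))
print(word_phrases("Hello, world! How are you?", strip_punctuation=True))
print(word_phrases("too short"))
```

With the default settings this gives the same four phrases as the session above. With `strip_punctuation=True` the commas and other marks are dropped before the phrases are built, and a string with fewer than 3 words returns an empty list.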