Tags: elasticsearch, filter, tokenize

How to tokenize a sentence based on maximum number of words in Elasticsearch?


I have a string like "This is a beautiful day". What tokenizer, or what combination of tokenizer and token filter, should I use to produce terms that contain at most 2 words? Ideally, the output should be: "This, This is, is, is a, a, a beautiful, beautiful, beautiful day, day". So far I have tried all the built-in tokenizers; the `pattern` tokenizer seems like the one I could use, but I don't know how to write a regex pattern for my case. Any help?


Solution

  • It seems you're looking for the shingle token filter; it does exactly what you want.
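
As a sketch of how this could look (the index and filter names here are placeholders): a shingle filter with `min_shingle_size: 2`, `max_shingle_size: 2`, and `output_unigrams: true` emits every single word plus every two-word shingle, which matches the desired output.

```json
PUT my-index
{
  "settings": {
    "analysis": {
      "filter": {
        "two_word_shingles": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 2,
          "output_unigrams": true
        }
      },
      "analyzer": {
        "shingle_analyzer": {
          "tokenizer": "standard",
          "filter": ["two_word_shingles"]
        }
      }
    }
  }
}
```

You can verify the token stream with the `_analyze` API:

```json
POST my-index/_analyze
{
  "analyzer": "shingle_analyzer",
  "text": "This is a beautiful day"
}
```

This should produce the tokens `This`, `This is`, `is`, `is a`, `a`, `a beautiful`, `beautiful`, `beautiful day`, `day`.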