I'm trying to pick out the free-text part of a user's search query.
For example, suppose the user enters:
domain:example.com and Welcome to my website
Currently the output is:
>> parser.parseString("domain:example.com and Welcome to my website")
([(['domain', ':', 'example.com'], {}), 'and Welcome to my website'], {})
My pyparsing code is:
import pyparsing as pp

word = pp.Word(pp.printables, excludeChars=":")
non_tag = word + ~pp.FollowedBy(":")
# tagged value is two words with a ":"
tag = pp.Group(word + ":" + word)
# one or more non-tag words - use originalTextFor to get back
# a single string, including intervening white space
phrase = pp.originalTextFor(non_tag[1, ...])
parser = (phrase | tag)[...]
free_text_search_res = parser.parseString(filters)
This works as expected. The problem is that I also need to parse the following query correctly:
>> parser.parseString("domain:example.com and date:[2012-12-12 TO 2014-12-12] and Welcome to my website")
([(['domain', ':', 'example.com'], {}), 'and', (['date', ':', '[2012-12-12'], {}), 'TO 2014-12-12] and Welcome to my website'], {})
The date part is wrong: I expected ['date', ':', '[2012-12-12 TO 2014-12-12]']. What am I doing wrong?
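You can try something like the following, which redefines word so that a bracketed value containing spaces is tried before a plain word: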
import pyparsing as pp

word = pp.Word(pp.printables, excludeChars=":")
# a bracketed value such as [a TO b] is tried first, then a plain word
word = ("[" + pp.Word(pp.printables + " ", excludeChars=":[]") + "]") | word
non_tag = word + ~pp.FollowedBy(":")
# tagged value is two words with a ":"
tag = pp.Group(word + ":" + word)
# one or more non-tag words - use originalTextFor to get back
# a single string, including intervening white space
phrase = pp.originalTextFor(non_tag[1, ...])
parser = (phrase | tag)[...]
# free_text_search_res = parser.parseString(filters)
# tag.parseString("date:[2012-12-12 TO 2014-12-12]")
parser.parseString("domain:example.com and date:[2012-12-12 TO 2014-12-12] and Welcome to my website")
This gives the following result:
([(['domain', ':', 'example.com'], {}), 'and', (['date', ':', '[', '2012-12-12 TO 2014-12-12', ']'], {}), 'and Welcome to my website'], {})
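Note that this result still splits the range into three tokens ('[', the text, and ']'), whereas the question asked for a single '[2012-12-12 TO 2014-12-12]' token. One possible way to get that, as a minimal sketch, is to match the whole bracketed range with a single pp.Regex. The names plain_word, bracketed and query, and the exact pattern, are assumptions of this sketch rather than part of the code above.

```python
import pyparsing as pp

# a plain word: any run of printable characters except ":"
plain_word = pp.Word(pp.printables, excludeChars=":")
# assumption: a range is "[...]" with no ":", "[" or "]" inside;
# Regex returns the whole match as one token, brackets included
bracketed = pp.Regex(r"\[[^\[\]:]+\]")
# try the bracketed form first, otherwise "[2012-12-12" matches as a plain word
word = bracketed | plain_word

non_tag = word + ~pp.FollowedBy(":")
# tagged value is two words with a ":"
tag = pp.Group(word + ":" + word)
# one or more non-tag words, returned as a single string
phrase = pp.originalTextFor(non_tag[1, ...])
parser = (phrase | tag)[...]

query = "domain:example.com and date:[2012-12-12 TO 2014-12-12] and Welcome to my website"
result = parser.parseString(query).asList()
print(result)
```

With this change the date group comes back as ['date', ':', '[2012-12-12 TO 2014-12-12]'], and the trailing free text is still returned as one string.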