c# regex search phrase

How to extract phrases and then words in a string of text?


I have a search method that takes in a user-entered string, splits it at each space character and then proceeds to find matches based on the list of separated terms:

string[] terms = searchTerms.ToLower().Trim().Split( ' ' );

Now I have been given a further requirement: to be able to search for phrases delimited by double quotes, à la Google. So if the search terms provided were:

"a line of" text

The search would match occurrences of "a line of" and "text" rather than the four separate terms (the opening and closing double quotes would also need to be removed before searching).

How can I achieve this in C#? I would assume regular expressions are the way to go, but I haven't dabbled in them much, so I don't know whether they are the best solution.

If you need any more info, please ask. Thanks in advance for the help.


Solution

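  • Here's a regex pattern that captures each quoted phrase or bare word in a group named 'term'. The whole input is consumed as a single match, and because .NET keeps every capture of a repeated group, the individual terms are read from that group's Captures collection: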

    ("(?<term>[^"]+)"\s*|(?<term>[^ ]+)\s*)+
    

    So for the input:

    "a line" of text
    

    The captures of the 'term' group would be:

    a line
    of
    text
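
A minimal sketch of how the pattern above might be wired into the search method. The class name `SearchTermParser` and the method `ExtractTerms` are hypothetical names made up for illustration, and the lowercasing and trimming simply mirror the original `Split` call from the question. It relies on .NET allowing the same group name in both alternatives and recording every capture of a repeated group.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class SearchTermParser
{
    // Same pattern as above, written as a verbatim string ("" is an escaped quote).
    private const string Pattern = @"(""(?<term>[^""]+)""\s*|(?<term>[^ ]+)\s*)+";

    public static List<string> ExtractTerms(string searchTerms)
    {
        var terms = new List<string>();

        // Normalise the input the same way the original Split-based code did.
        string input = searchTerms.ToLower().Trim();

        Match match = Regex.Match(input, Pattern);
        if (!match.Success)
        {
            return terms; // e.g. an empty search string
        }

        // The 'term' group is inside a repeated group, so each phrase or word
        // is a separate Capture rather than a separate Match.
        foreach (Capture capture in match.Groups["term"].Captures)
        {
            terms.Add(capture.Value);
        }

        return terms;
    }
}
```

Calling `SearchTermParser.ExtractTerms("\"a line of\" text")` returns two terms, `a line of` and `text`, with the quotes already stripped. Note that an unbalanced quote is not rejected: it falls through to the second alternative and stays attached to the word it precedes.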