Tags: javascript, jquery, regex, lawnchair

Basic search ranking with regex in JavaScript


Currently I use the approach below for search. I require that every term the user types appears at least once in the article. I use the match method with the regex

^(?=.*one)(?=.*two)(?=.*three).*$

with the g, i, and m flags.

At the moment I use matches.length to count the number of matches, but the behavior is not what I expect. For example, "one two three. one two three" gives me 2 matches, but it should really be 6.
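A minimal sketch of where the count of 2 probably comes from, assuming the two sentences sit on separate lines in the article (an assumption; the original data is not shown). With the m flag the pattern matches once per qualifying line, so the result counts lines that contain all three terms, not occurrences of the terms. The variable names are illustrative only.

```javascript
// The pattern from the question, with the g, i, and m flags.
var pattern = /^(?=.*one)(?=.*two)(?=.*three).*$/gim;

// Assumed sample data: the same text on one line and split over two lines.
var oneLine = 'one two three. one two three';
var twoLines = 'one two three.\none two three';

// match() returns null when nothing matches, hence the "|| []" fallback.
var oneLineCount = (oneLine.match(pattern) || []).length;
var twoLineCount = (twoLines.match(pattern) || []).length;

console.log(oneLineCount, twoLineCount);
```

Each match consumes a whole line, so the six individual words are never counted separately.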

If I do something like

(one|two|three)

then I do get 6 matches, but if I have the data:

"one two. one two"

I get 4 matches, when in reality I want 0, since not every word appears at least once. I could run the first regex to check whether there is at least one "match" and, if there is, run the second regex to count the real number of matches, but this would make my program even slower than it already is. Running this regex against 2500 JSON articles already takes anywhere from 60 to 120 seconds.

Any ideas on how to make this faster or better? Change the regex? Use search or indexOf instead of matches?


Note: I'm using Lawnchair for local persistence, along with jQuery. I package the code for PhoneGap and as a Chrome packaged app.


Solution

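  • var input = '...';
    var matches = [];
    // First pass: require every term to appear somewhere in the text.
    // [\s\S]* replaces .* because . does not match newlines in JavaScript.
    if (/^(?=[\s\S]*\bone\b)(?=[\s\S]*\btwo\b)(?=[\s\S]*\bthree\b)/i.test(input)) {
      // Second pass: collect every whole-word occurrence of any term.
      matches = input.match(/\b(one|two|three)\b/ig);
    }
    // matches.length is the score; it stays 0 when any term is missing.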
    

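The same idea can be sketched for arbitrary user input with a single pass over the text and no lookaheads. The helper names escapeRegExp and scoreArticle are hypothetical, not part of jQuery or Lawnchair. The sketch assumes the terms are plain words, so that \b behaves sensibly at both ends. Whether it is actually faster on the 2500 articles has not been measured here.

```javascript
// Hypothetical helper: escape regex metacharacters in a user-typed term.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: returns 0 unless every term occurs at least once as a
// whole word; otherwise returns the total number of whole-word occurrences.
function scoreArticle(text, terms) {
  // Normalise the terms once and drop empty strings, so the alternation
  // below can never produce a zero-length match.
  var wanted = terms
    .map(function (t) { return t.toLowerCase(); })
    .filter(function (t) { return t.length > 0; });
  if (wanted.length === 0) return 0;

  var re = new RegExp('\\b(' + wanted.map(escapeRegExp).join('|') + ')\\b', 'gi');
  var seen = Object.create(null); // no inherited keys such as "constructor"
  var total = 0;
  var m;

  // One scan of the text: count every hit and remember which terms occurred.
  while ((m = re.exec(text)) !== null) {
    seen[m[1].toLowerCase()] = true;
    total++;
  }

  // Discard the count if any term was never seen.
  for (var i = 0; i < wanted.length; i++) {
    if (!seen[wanted[i]]) return 0;
  }
  return total;
}

var terms = ['one', 'two', 'three'];
console.log(scoreArticle('one two three. one two three', terms));
console.log(scoreArticle('one two. one two', terms));
```

Because the text is scanned only once per article, the "all terms present" check costs a small loop over the terms rather than a second regex pass.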