Tags: javascript, parsing, search, string-parsing

How to create a search language?


I would like to create a simple filtering language in JavaScript.

I want to search inside a collection of products, say:

product=[
   {price:10,name:"T-Shirt",category:"Clothing",published_at:"07-08-2014",size:"10x30",color:"#0000FF"},
   {price:20,name:"Chair",category:"Furniture",published_at:"09-03-2013",size:"30x30",color:"#00FF00"},
   {price:30,name:"iPhone",category:"Phones",published_at:"17-03-2014",size:"40x30",color:"#FF00FF"},
   {price:40,name:"Samsung Galaxy",category:"Phones",published_at:"12-01-2012",size:"10x60",color:"#00BBBB"},
];

With a single text input, I would like to be able to query this array, for example:

  • cat:Clothing => gives back the T-Shirt
  • price:>15 => gives back Chair, iPhone, and Samsung Galaxy
  • name:iP => filters through the names and gives back iPhone
  • price:>15&&size:>35x25 => filters by price and size and gives back iPhone

I know there are some language parsers like

but I don't know which one to choose (and why), or whether it is a good idea to use one at all. Any ideas?


Solution

  • I decided to try Jison (similar to Bison):

    /* description: Parses a simple product filter query into a filter object. */
    
    /* lexical grammar */
    %lex
    %%
    
    \s+                   /* skip whitespace */
    [0-9]+("."[0-9]+)?\b  return 'NUMBER'
    ">"                     return '>'
    "price:"              return 'PRICE'
    <<EOF>>               return 'EOF'
    "name:"               return 'NAME'
    [a-z]+                return 'STRING'
    ","                   return 'COMMA'
    
    
    /lex
    
    /* operator associations and precedence */
    
    %left 'COMMA'
    %left 'PRICE' 'NAME'
    
    
    %start expressions
    
    %% /* language grammar */
    
    expressions
        : e EOF
            { typeof console !== 'undefined' ? console.log($1) : print($1);
              return $1; }
        ;
    
    e
        : 'PRICE' '>' e
            {$$ = {price:{gt:$3}};}
        | 'NAME' e
            {$$ = {name:{contains:$2}};}
        | NUMBER
            {$$ = Number(yytext);}
        | STRING
            {$$ = String(yytext);}
        | e 'COMMA' e
            { /* merge the left-hand conditions into the right-hand object */
              for (var attrname in $1) { $3[attrname] = $1[attrname]; }
              $$ = $3; }
        ;
    

    The result of the parser is the following:

    price:>30 => { price: { gt: 30 } }

    name:blabla,price:>10 => { price: { gt: 10 }, name: { contains: 'blabla' } }

    name:test => { name: { contains: 'test' } }
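
A filter object of this shape still has to be applied to the product list. The following is only a minimal sketch of that step, not part of the Jison grammar: the names `operators` and `applyFilter` are hypothetical, only the `gt` and `contains` operators shown above are handled, and making `contains` case-insensitive is an assumption (the grammar only accepts lowercase strings).

```javascript
// Sample data from the question (two fields are enough for this sketch).
var products = [
  { price: 10, name: "T-Shirt" },
  { price: 20, name: "Chair" },
  { price: 30, name: "iPhone" },
  { price: 40, name: "Samsung Galaxy" }
];

// Hypothetical operator table: one predicate per operator name in the filter object.
var operators = {
  gt: function (value, operand) { return value > operand; },
  // Assumption: substring match that ignores case.
  contains: function (value, operand) {
    return String(value).toLowerCase().indexOf(String(operand).toLowerCase()) !== -1;
  }
};

// Returns the items that satisfy every condition of every field in the filter.
function applyFilter(items, filter) {
  return items.filter(function (item) {
    return Object.keys(filter).every(function (field) {
      var conditions = filter[field];
      return Object.keys(conditions).every(function (op) {
        if (!Object.prototype.hasOwnProperty.call(operators, op)) {
          throw new Error("Unknown operator: " + op);
        }
        return operators[op](item[field], conditions[op]);
      });
    });
  });
}

var matches = applyFilter(products, { price: { gt: 15 } });
console.log(matches.map(function (p) { return p.name; }));
```

Supporting another comparison (for example a `lt` operator) would then mean adding one entry to the operator table and one rule to the grammar.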

    This solution seems a bit more portable to me, because the filtering code only consumes the resulting object and doesn't deal with the parsing directly.
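
To illustrate that separation: for a grammar this small, the same kind of filter object could also be produced without a parser generator. The sketch below is a hand-written alternative, not Jison output. The function name `parseQuery` is hypothetical, and it only covers the two term forms used in the examples above (`price:>NUMBER` and `name:lowercase-word`, separated by commas).

```javascript
// Parses queries such as "name:blabla,price:>10" into
// { name: { contains: "blabla" }, price: { gt: 10 } }.
function parseQuery(query) {
  var filter = {};
  query.split(",").forEach(function (rawTerm) {
    var term = rawTerm.trim();
    var match;
    if ((match = /^price:\s*>\s*([0-9]+(?:\.[0-9]+)?)$/.exec(term))) {
      // price:>NUMBER
      filter.price = { gt: Number(match[1]) };
    } else if ((match = /^name:\s*([a-z]+)$/.exec(term))) {
      // name:word (lowercase letters only, like the STRING token)
      filter.name = { contains: match[1] };
    } else {
      throw new Error("Cannot parse term: " + term);
    }
  });
  return filter;
}

console.log(parseQuery("name:blabla,price:>10"));
```

A splitting approach like this stops scaling once the language needs nesting, parentheses, or operator precedence, which is where a generated parser becomes the more maintainable choice.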