Find all URLs that do not start with "http" in ElasticSearch regex 1.7


I'm writing an ElasticSearch (v1.7) query to find all URLs that do not start with http, but my query returns an empty result, even though I definitely have URLs that do not start with http. Could you help me fix it?

"query": {
  "regexp":{
    "url": {
      "value": "@&~(http.+)",
      "flags" : "ANYSTRING"
    }
  }
} 

Solution

  • Your query should work once you remove the flags:

    "query": {
      "regexp":{
        "url": {
          "value": "@&~(http.+)"
        }
      }
    }
    

    Or set flags explicitly to ALL, which is the default value:

    "query": {
      "regexp":{
        "url": {
          "value": "@&~(http.+)",
          "flags" : "ALL"
        }
      }
    } 
    

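    ANYSTRING enables only the @ operator. The ~ operator is enabled by the COMPLEMENT flag, and the & operator by the INTERSECTION flag, so with ANYSTRING alone the ~ and & in your pattern are not treated as operators and nothing matches. Unless you need to restrict the syntax, it is safer to go with the default value.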
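
    If you would rather enable only the operators the pattern actually uses, the flags can be combined with |. A minimal sketch, assuming the same url field as in the question:

    "query": {
      "regexp":{
        "url": {
          "value": "@&~(http.+)",
          "flags" : "INTERSECTION|COMPLEMENT|ANYSTRING"
        }
      }
    }

    This should behave like the ALL variant for this pattern, since @, & and ~ are the only optional operators it contains.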