Search code examples
rubyruby-on-rails-4elasticsearchelasticsearch-rails

Elasticsearch Ruby Activerecord Persistence Model URL term search


I'm trying to make a search on a field that contains URL using elastic search Term query. I use elasticsearch-rails the ActiveRecord Persistance Pattern. This is how I try to do it.

total_views = UserAction.search :query=> {
        :filtered=> {
            :filter=> {
                :term=> { action_path:"http://0.0.0.0:3000/tshirt/test" } 
            }
        }
    }  

It works if there are no '/' or ':' characters. For example when the action_path is just 'tshirt'. The other fields are not analyzed and they work if there are no '/', ':' kinds of characters in the field. So obviously elastic search tries to analyze it but the problem is they should not be analyzed because mapping is already there.

This my user action class

class UserAction
  include Elasticsearch::Persistence::Model  
  extend Calculations
  include Styles

  attribute :user_id, Integer
    attribute :user_referrer, String, mapping: { index: 'not_analyzed' } 
    attribute :user_ip, String, mapping: { index: 'not_analyzed' } 
    attribute :user_country, String, mapping: { index: 'not_analyzed' }
    attribute :user_city, String, mapping: { index: 'not_analyzed' }
    attribute :user_device, String, mapping: { index: 'not_analyzed' }
  attribute :user_agent, String, mapping: { index: 'not_analyzed' }
    attribute :user_platform
  attribute :user_visitid, Integer
    attribute :action_type, String, mapping: { index: 'not_analyzed' } 
    attribute :action_css, String, mapping: { index: 'not_analyzed' }
  attribute :action_text, String, mapping: { index: 'not_analyzed' }
  attribute :action_path, String, mapping: { index: 'not_analyzed' } 
  attribute :share_url, String, mapping: { index: 'not_analyzed' } 
  attribute :tag 
  attribute :date 

I also tried adding indexes using 'mapping do.." and then "create_index!" but result is the same. Because mapping is there it does create the mapping.

This is my gem file

   gem "elasticsearch-model", git: "git://github.com/elasticsearch/elasticsearch-rails.git", require: "elasticsearch/model"
          gem "elasticsearch-persistence", git: "git://github.com/elasticsearch/elasticsearch-rails.git", require: "elasticsearch/persistence/model"
          gem "elasticsearch-rails"

When I make the search I also see that those fields that are not analyzed.

       :reload_on_failure=>false,
         :randomize_hosts=>false,
         :transport_options=>{}},
       @protocol="http",
       @reload_after=10000,
       @resurrect_after=60,
       @serializer=
        #<Elasticsearch::Transport::Transport::Serializer::MultiJson:0x007fc4bf9e0e18
         @transport=#<Elasticsearch::Transport::Transport::HTTP::Faraday:0x007fc4bf9b35a8 ...>>,
       @sniffer=
        #<Elasticsearch::Transport::Transport::Sniffer:0x007fc4bf9e0dc8
         @timeout=1,
         @transport=#<Elasticsearch::Transport::Transport::HTTP::Faraday:0x007fc4bf9b35a8 ...>>,
       @tracer=nil>>,
   @document_type="user_action",
   @index_name="useraction",
   @klass=UserAction,
   @mapping=
    #<Elasticsearch::Model::Indexing::Mappings:0x007fc4bfab18d8
     @mapping=
      {:created_at=>{:type=>"date"},
       :updated_at=>{:type=>"date"},
       :user_id=>{:type=>"integer"},
       :user_referrer=>{:type=>"string"},
       :user_ip=>{:type=>"string"},
       :user_country=>{:type=>"string", :index=>"not_analyzed"},
       :user_city=>{:type=>"string", :index=>"not_analyzed"},
       :user_device=>{:type=>"string", :index=>"not_analyzed"},
       :user_agent=>{:type=>"string", :index=>"not_analyzed"},
       :user_platform=>{:type=>"string"},
       :user_visitid=>{:type=>"integer"},
       :action_type=>{:type=>"string", :index=>"not_analyzed"},
       :action_css=>{:type=>"string", :index=>"not_analyzed"},
       :action_text=>{:type=>"string", :index=>"not_analyzed"},
       :action_path=>{:type=>"string", :index=>"not_analyzed"}},
     @options={},
     @type="user_action">,
   @options={:host=>UserAction}>,
 @response={"took"=>1, "timed_out"=>false, "_shards"=>{"total"=>4, "successful"=>4, "failed"=>0}, "hits"=>{"total"=>0, "max_score"=>nil, "hits"=>[]}}>
(END) 

the initializer file has nothing other than the elastichq connection url.

Data is there in elastichq so I should get the results but can't get any.

    user_action 1   AUzH9xKDueQ8OtBQuyQC    http://example.org/api/analytics/track
user_actions    user_action 1   AUzIAUsvueQ8OtBQuyQg    http://0.0.0.0:3000/tshirt/funnel_test2
user_actions    user_action 1   AUzH7ay5ueQ8OtBQuyP2    http://example.org/api/analytics/track
user_actions    user_action 1   AUzH-HAdueQ8OtBQuyQU    http://0.0.0.0:3000/tshirt/test
user_actions    user_action 1   AUzIJbCGueQ8OtBQuyQ4    http://example.org/api/analytics/track
user_actions    user_action 1   AUzIJbCjueQ8OtBQuyQ5    http://example.org/api/analytics/track

Curl Results from Elastichq

curl -XGET "https://YYYYY:XXXXX@xxxx.qbox.io/user_actions/_mapping"
{
  "user_actions": {
    "mappings": {
      "user_action": {
        "properties": {
          "action_css": { "type": "string" },
          "action_path": { "type": "string" },
          "action_text": { "type": "string" },
          "action_type": { "type": "string" },
          "created_at": { "format": "dateOptionalTime", "type": "date" },
          "date": { "type": "string" },
          "share_url": { "type": "string" },
          "tag": { "type": "string" },
          "updated_at": { "format": "dateOptionalTime", "type": "date" },
          "user_agent": { "type": "string" },
          "user_city": { "type": "string" },
          "user_country": { "type": "string" },
          "user_device": { "type": "string" },
          "user_id": { "type": "long" },
          "user_ip": { "type": "string" },
          "user_referrer": { "type": "string" },
          "user_visitid": { "type": "long" }
        }
      }
    }
  }
}

can anybody help me on getting url term search work?


Solution

  • I did what I didn't want to do. Created the index with its mapping manually with the below post request so elasticsearch-rails can't create it wrong. Now everything works fine

    curl -XPOST https://xxxxxx.qbox.io/user_actions -d '{
        "settings" : {
            "number_of_shards" : 1
        },
        "mappings" : {
            "user_action" : {
                "_source" : { "enabled" : false },
                "properties" : {
                    "action_path" : { "type" : "string", "index" : "not_analyzed" }
                }
            }
        }
    }'