
How to configure Elasticsearch on Hadoop?


I want to configure Elasticsearch on Hadoop and Hive. Elasticsearch is running on my local machine, and Hadoop is on another machine. I am using the HDP Sandbox, version 2.2. How can I configure this? Also, is there any UI provided in the Sandbox?


Solution

  • These are the steps to configure Elasticsearch on Hadoop.

    STEP 1

    Create the Hive table that you want to load the data into:

    CREATE TABLE logs (type STRING, time STRING, ext STRING, ip STRING, req STRING, res INT, bytes INT, phpmem INT, agent STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
    

    STEP 2

    Load data into the table (here, a log file named apache.log that is already in HDFS):

    LOAD DATA INPATH '/user/apache/apache.log' OVERWRITE INTO TABLE logs;
    

    STEP 3

    Add the elasticsearch-hadoop JAR file to the classpath (note the version used here, 2.1.0.BUILD-SNAPSHOT):

    ADD JAR elasticsearch-hadoop-2.1.0.BUILD-SNAPSHOT.jar;
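
    A relative JAR name is resolved against the directory Hive was started from, so an absolute path is often safer. The sketch below assumes the JAR was copied to /tmp on the Sandbox; that path is hypothetical, so substitute the real location. LIST JARS then shows what the session has registered.

    ```sql
    -- Hypothetical location: replace /tmp with wherever the JAR actually lives
    ADD JAR /tmp/elasticsearch-hadoop-2.1.0.BUILD-SNAPSHOT.jar;

    -- Confirm that the JAR is registered for the current Hive session
    LIST JARS;
    ```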
    

    STEP 4

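    Create an external table, eslogs, that maps to an index in the Elasticsearch instance running on your local machine. Replace IP_ON_WHICH_ELASTICSEARCH_IS_RUNNING with an address that the Hadoop machine can reach: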

    CREATE EXTERNAL TABLE eslogs (time STRING, extension STRING, clientip STRING, request STRING, response INT, agent STRING)
    STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
    TBLPROPERTIES('es.resource' = 'test/test',
                  'es.mapping.names' = 'time:@timestamp',
                  'es.nodes' = 'IP_ON_WHICH_ELASTICSEARCH_IS_RUNNING:9200');
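
    Before inserting any data, it may help to check that the table was registered with the intended settings. As a quick sanity check, DESCRIBE FORMATTED prints the storage handler and the table parameters, so a mistyped es.nodes or es.resource value should be visible here.

    ```sql
    -- Shows the columns, the storage handler, and the TBLPROPERTIES of eslogs
    DESCRIBE FORMATTED eslogs;
    ```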
    

    STEP 5

    Insert the data from the logs table into eslogs, which writes it to Elasticsearch:

    INSERT OVERWRITE TABLE eslogs SELECT s.time, s.ext, s.ip, s.req, s.res, s.agent FROM logs s;
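
    One way to check the result, assuming the insert finished without errors, is to read a few rows back through the same external table. Hive then queries Elasticsearch via the storage handler, so rows coming back suggest that the Hadoop machine can reach the Elasticsearch node and that the test/test resource holds data. Comparing against the row count of the source table is a rough completeness check.

    ```sql
    -- Read a sample of documents back from Elasticsearch through Hive
    SELECT * FROM eslogs LIMIT 10;

    -- Row count of the source table, for comparison
    SELECT COUNT(*) FROM logs;
    ```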
    

    You can refer to this link.