javascript reactjs elasticsearch kaggle

How do I add a Kaggle dataset to Elasticsearch?


I am new to Elasticsearch and I am trying to build a movie search app. For this, I plan to get data from Kaggle and add it to my Elasticsearch instance, which I have set up locally at localhost:9200. This is what I see when I open that address:

{
  "name": "bxiIZLL",
  "cluster_name": "elasticsearch",
  "cluster_uuid": "zc_JPmw4TQ2G5bvahEF6LQ",
  "version": {
    "number": "5.6.14",
    "build_hash": "f310fe9",
    "build_date": "2018-12-05T21:20:16.416Z",
    "build_snapshot": false,
    "lucene_version": "6.6.1"
  },
  "tagline": "You Know, for Search"
}

Now I need to add the Kaggle data to this server. How can I do that? I saw the curl -XPUT command mentioned somewhere, but I am not sure how it would work with Kaggle data.

A follow-up question: if I want to publish my app later on, how can I host Elasticsearch?


Solution

  • To upload a CSV file to Elasticsearch:

    1. Download the file from Kaggle.
    2. Use Logstash to read the file with the file input plugin.
    3. Modify and transform the data as needed with Logstash's csv filter.
    4. Send the output from Logstash to Elasticsearch with the elasticsearch output plugin.
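
    The steps above could be sketched as a single Logstash pipeline file. This is only a minimal sketch under assumptions: the file name movies.conf, the CSV path, the column names (title, year, genre) and the index name movies are all hypothetical and need to be replaced with the values of the dataset you actually downloaded. It also assumes a Logstash version compatible with your 5.6.14 server.

    ```
    # movies.conf (hypothetical file name; path, columns and index are assumptions)
    input {
      file {
        path => "/path/to/movies.csv"    # must be an absolute path to the downloaded CSV
        start_position => "beginning"    # read the existing file from the top
        sincedb_path => "/dev/null"      # do not remember the read position (handy while testing)
      }
    }

    filter {
      csv {
        separator => ","
        columns => ["title", "year", "genre"]   # replace with the real column names
      }
      # The header row is read like any other line, so drop it
      if [title] == "title" {
        drop { }
      }
      mutate {
        convert => { "year" => "integer" }      # store the year as a number, not a string
        remove_field => ["message"]             # the raw CSV line is no longer needed
      }
    }

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "movies"
      }
      stdout { codec => rubydebug }             # also print each event for debugging
    }
    ```

    You would then start the pipeline with bin/logstash -f movies.conf from the Logstash directory, and check that documents arrived with curl "localhost:9200/movies/_count".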

    For your follow-up question about hosting Elasticsearch: you can either run it yourself, for example on AWS EC2, or use a managed service such as Elastic Cloud or Amazon Elasticsearch Service (AWS ES). Good luck.