sql elasticsearch search nosql full-text-search

Designing a tag system with NoSQL/Elasticsearch


I have to design a system with this schema.

{
  "documentId": 123,
  "documentType": "paper",
  "tags": ["abc", "xyz"]
  // other metadata of the document
}

The queries I will run are: find the k most popular tags, get documents by tag, add/remove/update tags, and get all tags of a document. What is the optimal strategy, considering that the DB should be highly scalable? I am considering four solutions:

  1. Create a document in a NoSQL DB such as MongoDB and index the tags array, so MongoDB is my primary DB.
  2. Use Elasticsearch as the primary DB and index the full document, then serve all queries from it.
  3. Use Kafka with a Spark/Storm streaming solution.
  4. Design a slow and a fast pipeline as in this video: https://www.youtube.com/watch?v=kx-XDoPjoHw&t=1835s (I am not sure whether Spark already works this way internally).

What is the optimal way to handle such cases?
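
For concreteness, option 1 could be sketched as the plain filter, update, and pipeline documents below. This is only a minimal sketch under assumptions: the helper names are hypothetical, no database connection is made, and the returned dictionaries are what one would hand to a MongoDB driver (for example pymongo's `create_index`, `find`, `update_one`, and `aggregate`).

```python
from typing import Any, Dict, List, Tuple

# Hypothetical helpers (names are illustrative, not from the question).
# Each returns the plain document one would pass to a MongoDB driver.

def tags_index_spec() -> List[Tuple[str, int]]:
    """Key spec for an ascending index on `tags`.
    MongoDB treats an index on an array field as a multikey index."""
    return [("tags", 1)]

def docs_by_tag_filter(tag: str) -> Dict[str, Any]:
    """Filter matching documents whose `tags` array contains `tag`."""
    return {"tags": tag}

def add_tag_update(tag: str) -> Dict[str, Any]:
    """Update that adds `tag` only if it is not already present."""
    return {"$addToSet": {"tags": tag}}

def remove_tag_update(tag: str) -> Dict[str, Any]:
    """Update that removes every occurrence of `tag` from the array."""
    return {"$pull": {"tags": tag}}

def set_tags_update(tags: List[str]) -> Dict[str, Any]:
    """Update that replaces the whole `tags` array."""
    return {"$set": {"tags": tags}}

def top_k_tags_pipeline(k: int) -> List[Dict[str, Any]]:
    """Aggregation pipeline: one row per tag, counted, sorted, truncated to k."""
    return [
        {"$unwind": "$tags"},
        {"$group": {"_id": "$tags", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
        {"$limit": k},
    ]
```

Getting all tags of a document is then a lookup by `documentId`. Note that the top-k pipeline as written touches every document, so at large scale a separately maintained per-tag counter may be worth considering.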


Solution

  • It depends:

    • Do we need free-text search for the tag system? (Q1)
    • What is the update rate (number of docs updated every minute)?

    IMHO, if the answer to Q1 is yes and the update rate is low, use ES.

    If the answer to Q1 is no and the update rate is high, you may want to consider a non-Elasticsearch solution.

    If the update rate is high and the answer to Q1 is yes, consider a non-Elasticsearch solution (depending on the size of your index, it is very much possible to use ES, though it may not be optimal).
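
To make the first question above more concrete: in Elasticsearch, the difference between exact tag lookup and free-text search roughly corresponds to a `term` query versus a `match` query. The sketch below only builds the request bodies as Python dictionaries. The function names are hypothetical, and it assumes `tags` is mapped as a `keyword` field with a hypothetical `tags.text` sub-field mapped as `text`; neither mapping is given in the question.

```python
from typing import Any, Dict

def docs_by_tag_query(tag: str) -> Dict[str, Any]:
    """Exact lookup by tag; assumes `tags` is a `keyword` field."""
    return {"query": {"term": {"tags": tag}}}

def free_text_tag_query(text: str) -> Dict[str, Any]:
    """Analyzed (free-text) search; assumes a `text` sub-field `tags.text`."""
    return {"query": {"match": {"tags.text": text}}}

def top_k_tags_query(k: int) -> Dict[str, Any]:
    """Top-k tags via a terms aggregation; `size: 0` skips returning hits.
    Buckets are ordered by document count, descending, by default."""
    return {
        "size": 0,
        "aggs": {"popular_tags": {"terms": {"field": "tags", "size": k}}},
    }
```

If only the exact lookup and the aggregation are needed (Q1 is no), the same access patterns can be served by most stores with an index on the tags array, which is part of why the update rate becomes the deciding factor.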