Tags: java, rest, hadoop, e-commerce, microservices

Hadoop integration with e-commerce portal


We are building a new e-commerce portal from scratch using Java REST services, and we plan to use MySQL (for now; Oracle in the future). We are also using Elasticsearch. The whole portal is being built as microservices. My question is: do I need to take care of analytics (such as Hadoop and HDFS integration) from the beginning?


Solution

  • A single relational database works fine at first, but it scales poorly, especially for large-scale web services.

    You need to measure your data ingestion volume and size to determine whether you need Hadoop (more specifically, HDFS) for batch analytics on top of Elasticsearch. You most likely do not: a standalone Apache Spark cluster can run jobs against Elasticsearch directly.

    However, you could also use Kafka as a message bus that feeds both your JDBC-compatible database and your Elasticsearch index. Spark Streaming also works well with Kafka.

    If you later want to add Hadoop to the mix, you can pull the same data from Kafka to fill an HDFS directory.

    There are many blog posts about microservices communicating via Kafka.
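To make the "measure your ingestion volume" advice concrete, here is a minimal back-of-the-envelope sketch in Java. The class name `IngestionEstimate`, its methods, and the sample figures (500 events per second, 1,000 bytes per event) are all hypothetical placeholders, not numbers from the question; substitute values measured from your own services. The sketch only does the arithmetic; deciding at what volume HDFS becomes worthwhile is still a judgment call.

```java
public class IngestionEstimate {

    private static final long SECONDS_PER_DAY = 86_400L;

    // Raw bytes ingested per day for a steady event rate.
    // Both parameters are assumptions you must measure yourself.
    static long bytesPerDay(long eventsPerSecond, long avgEventBytes) {
        return eventsPerSecond * avgEventBytes * SECONDS_PER_DAY;
    }

    // The same volume projected over a 365-day year, in decimal terabytes (1e12 bytes).
    static double terabytesPerYear(long eventsPerSecond, long avgEventBytes) {
        return bytesPerDay(eventsPerSecond, avgEventBytes) * 365.0 / 1e12;
    }

    public static void main(String[] args) {
        long eventsPerSecond = 500;   // hypothetical: order, click, and search events combined
        long avgEventBytes = 1_000;   // hypothetical: average serialized event size

        long daily = bytesPerDay(eventsPerSecond, avgEventBytes);
        double yearlyTb = terabytesPerYear(eventsPerSecond, avgEventBytes);

        System.out.println("Bytes per day:      " + daily);
        System.out.println("Terabytes per year: " + yearlyTb);
    }
}
```

The estimate covers raw events only; replication, indexing overhead, and retention policy would change the storage actually needed, so treat the result as a rough lower bound.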