hadoop apache-kafka cloud cloudera hortonworks-data-platform

Big Data on Cloud (Azure)


I have already implemented production big data solutions, mainly on premises using Hadoop and NoSQL products, but never in the cloud.

Today I need to move to the cloud, so I'm wondering what the known implementations (in production, not only POCs) of big data in the cloud (mainly Azure) are:

  1. Full PaaS solution: EMR/HDInsight + S3/Azure Blob Storage (or Azure Data Lake) + Kinesis/Azure Event Hubs
  2. Full IaaS distributions (CDH, HDP): Cloudera or Hortonworks on IaaS + Kafka on IaaS
  3. Hybrid PaaS + IaaS: cold data on S3/Azure Blob Storage, warm and hot data and computation on IaaS Hadoop, AD as PaaS + Azure Event Hubs on PaaS

Best regards


Solution

  • In addition to what has been said above, I've found many production implementations in the cloud using both full PaaS and IaaS solutions; one of the more mature was the Netflix one, based on S3 and EMR.