hadoop, solr, solrcloud

What does SolrCloud on Hadoop mean?


I am new to distributed systems. I know that SolrCloud provides distributed search capabilities, and as far as I know Hadoop is a distributed processing framework for big data. Why, then, do we integrate two distributed frameworks? Is it for indexing and searching files stored in HDFS? What are the advantages of using Hadoop with SolrCloud? If anyone could explain in detail or give me links to better understand both, it would be really helpful.


Solution

  • Solr is (mainly) for storage and searching; Hadoop is (mainly) for distributed processing. They solve different problems.

    The most common setup is to have Solr store and load its index files on HDFS, either to reuse the storage and infrastructure of an existing HDFS cluster or to make the output of a Hadoop job searchable through Solr.

    A quick web search turns up plenty of use cases, presentations, and libraries, such as LucidWorks' Hadoop integrations, Solr+Hadoop, and Hortonworks' "Indexing and searching data in Apache Solr".
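To make the "processed Hadoop result searchable through Solr" case a bit more concrete, here is a minimal sketch in Python. It assumes the Hadoop job wrote tab-separated `key<TAB>count` lines (as a word-count job typically would) and turns them into a JSON update request for Solr's `/update` handler. The collection name `jobresults`, the base URL, and the field name `count_i` (which relies on a `*_i` dynamic integer field being present in the schema) are all assumptions for illustration, not something from your setup.

```python
import json
import urllib.request


def parse_job_output(lines):
    """Turn 'key<TAB>count' lines (assumed job output format) into Solr documents."""
    docs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        key, count = line.split("\t", 1)
        # 'count_i' assumes the schema has a '*_i' dynamic integer field.
        docs.append({"id": key, "count_i": int(count)})
    return docs


def build_update_request(solr_base_url, collection, docs):
    """Build (but do not send) a JSON update request for a Solr collection."""
    url = f"{solr_base_url}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical sample of what a part-r-00000 file might contain.
sample_lines = ["hadoop\t42\n", "solr\t17\n"]
docs = parse_job_output(sample_lines)
request = build_update_request("http://localhost:8983/solr", "jobresults", docs)
# urllib.request.urlopen(request) would send it to a running Solr instance.
```

In a real cluster you would more likely read the job output straight from HDFS and index it in bulk with one of the integrations mentioned above; the sketch only shows the shape of the hand-off from Hadoop output to Solr documents.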