I am trying to build a big data platform to receive and store large amounts of heterogeneous data (documents, videos, images, sensor data, etc.) in Hadoop, and then implement a classification process.
What architecture would help? I am currently using:
VMware vSphere ESXi
Hadoop
HBase
Thrift
XAMPP
All of these are working fine, but I don't know how to receive and store large amounts of data, because I have discovered that HBase is a column-oriented database, not a data warehouse.
You have to tailor the solution to the type of big data (structured, semi-structured, and unstructured).
You can use Hive/HBase for structured data if the total data size is <= 10 TB.
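As a rough sketch of what querying structured data in Hive looks like from Java over JDBC (the HiveServer2 address, credentials, and the web_logs table are assumptions for the example):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveQueryExample {
        public static void main(String[] args) throws Exception {
            // Register the HiveServer2 JDBC driver.
            Class.forName("org.apache.hive.jdbc.HiveDriver");

            // Assumes HiveServer2 on localhost:10000 and a hypothetical
            // "web_logs" table in the default database.
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://localhost:10000/default", "hive", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                     "SELECT status, COUNT(*) FROM web_logs GROUP BY status")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                }
            }
        }
    }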
You can use Sqoop to import structured data from a traditional RDBMS such as Oracle or SQL Server.
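A minimal sketch of a Sqoop import, here invoked programmatically through Sqoop's Java entry point; the connection URL, credentials, table name, and target directory are placeholders, and the same arguments work on the sqoop command line:

    import org.apache.sqoop.Sqoop;

    public class SqoopImportExample {
        public static void main(String[] args) {
            // Equivalent to running "sqoop import ..." on the command line.
            // Connection URL, credentials, and table name are placeholders.
            String[] sqoopArgs = {
                "import",
                "--connect", "jdbc:oracle:thin:@//dbhost:1521/ORCL",
                "--username", "scott",
                "--password", "tiger",
                "--table", "CUSTOMERS",
                "--target-dir", "/data/structured/customers",
                "--num-mappers", "4"
            };
            int exitCode = Sqoop.runTool(sqoopArgs);
            System.exit(exitCode);
        }
    }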
You can use Flume for ingesting unstructured data.
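For context, applications typically push events into a Flume agent through its RPC client API; a rough sketch, assuming an Avro source listening on localhost:41414 (the sample event body is made up):

    import java.nio.charset.StandardCharsets;

    import org.apache.flume.Event;
    import org.apache.flume.api.RpcClient;
    import org.apache.flume.api.RpcClientFactory;
    import org.apache.flume.event.EventBuilder;

    public class FlumeClientExample {
        public static void main(String[] args) throws Exception {
            // Assumes a Flume agent with an Avro source on localhost:41414.
            RpcClient client = RpcClientFactory.getDefaultInstance("localhost", 41414);
            try {
                // Wrap one raw record (e.g. a log line or sensor reading)
                // in a Flume event and send it to the agent.
                Event event = EventBuilder.withBody(
                    "sensor-42,2015-06-01T12:00:00Z,23.5",
                    StandardCharsets.UTF_8);
                client.append(event);
            } finally {
                client.close();
            }
        }
    }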
You can use a Content Management System (CMS) to handle unstructured and semi-structured data at terabyte or petabyte scale. If you are storing unstructured data, I prefer to keep the content itself in the CMS and store the metadata in a NoSQL database like HBase.
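To illustrate the metadata idea, here is a sketch using the HBase Java client; the doc_metadata table, the meta column family, and the CMS URI scheme are assumptions for the example:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class MetadataWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("doc_metadata"))) {
                // Row key: document id. The content itself lives in the CMS;
                // HBase holds only descriptive metadata plus a pointer to it.
                Put put = new Put(Bytes.toBytes("doc-00042"));
                put.addColumn(Bytes.toBytes("meta"), Bytes.toBytes("type"),
                              Bytes.toBytes("video/mp4"));
                put.addColumn(Bytes.toBytes("meta"), Bytes.toBytes("cms_uri"),
                              Bytes.toBytes("cms://repository/videos/doc-00042"));
                table.put(put);
            }
        }
    }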
To process big data streams, you can use Pig.
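A small sketch of running a Pig dataflow from Java via PigServer; the input path and field layout are made up for the example:

    import org.apache.pig.ExecType;
    import org.apache.pig.PigServer;

    public class PigJobExample {
        public static void main(String[] args) throws Exception {
            // Runs the Pig Latin statements as a MapReduce job on the cluster.
            PigServer pig = new PigServer(ExecType.MAPREDUCE);
            try {
                // Input path and field layout are placeholders for this sketch.
                pig.registerQuery(
                    "logs = LOAD '/data/raw/sensor_logs' USING PigStorage(',') "
                    + "AS (sensor:chararray, ts:chararray, value:double);");
                pig.registerQuery("by_sensor = GROUP logs BY sensor;");
                pig.registerQuery(
                    "avg_value = FOREACH by_sensor GENERATE group, AVG(logs.value);");
                pig.store("avg_value", "/data/processed/sensor_averages");
            } finally {
                pig.shutdown();
            }
        }
    }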
Have a look at how structured and unstructured data are handled in Hadoop.