hadoop analytics bigdata rdbms datamart

Combining RDBMS and big data into a datamart?


I have an RDBMS (SQL Server/Oracle) on one end and a Hadoop data store on the other. The primary key 'customer' is common to both data stores.

A few questions:

  1. Is it possible to have a datamart that can pull data from both RDBMS & Big data and produce reports? What would be a tool example?
  2. Does the datamart itself need to be an RDBMS store, or can it be something in-memory?
  3. What's the best way to run data analytics in this environment?
  4. What about data visualization?

Or should I just get all data into an RDBMS data warehouse and then solve for these questions?


Solution

  • Data virtualization or data federation is what you're looking for - i.e. the ability to query a single source that accesses multiple underlying data stores as needed.

    Databases usually have some limited capability in this area, which lets you define external tables; see, for example, this link for Oracle and HDFS.
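
As a rough illustration of the federation idea (not of any particular product), the toy sketch below exposes one report function that reads from two separate sources and joins them on the shared 'customer' key. Everything in it is an assumption made for the example: an in-memory SQLite database stands in for the SQL Server/Oracle side, a small delimited string stands in for a file stored in Hadoop, and the table, column, and function names are hypothetical.

```python
import csv
import io
import sqlite3

# Assumption: sample delimited data standing in for a file kept in Hadoop/HDFS.
HADOOP_EXPORT = "customer,clicks\n1,10\n1,5\n3,7\n"


def load_rdbms_customers():
    """Read customer names from the relational side (SQLite as a stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customer INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Alice"), (2, "Bob"), (3, "Carol")],
    )
    rows = conn.execute("SELECT customer, name FROM customers").fetchall()
    conn.close()
    return {customer: name for customer, name in rows}


def load_hadoop_clicks(text):
    """Sum clicks per customer from the delimited big-data stand-in."""
    totals = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = int(row["customer"])
        totals[key] = totals.get(key, 0) + int(row["clicks"])
    return totals


def federated_report():
    """Single access point: join both sources on the shared customer key."""
    customers = load_rdbms_customers()
    clicks = load_hadoop_clicks(HADOOP_EXPORT)
    # Customers with no activity on the big-data side get a count of 0.
    return [
        (cid, name, clicks.get(cid, 0))
        for cid, name in sorted(customers.items())
    ]


if __name__ == "__main__":
    for line in federated_report():
        print(line)
```

A real federation layer or external-table feature would push this join down to the engines involved rather than pulling rows into application memory; the sketch only shows the shape of "one query, several sources".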