Tags: database, amazon-web-services, google-bigquery, google-cloud-sql, amazon-redshift

Load data into Redshift and BigQuery directly from Hadoop/HDFS (local/on-premises cluster)


Is there any way to load data into Redshift and BigQuery directly from Hadoop/HDFS on a local (on-premises) cluster? I need to load 1 TB of data into Redshift and BigQuery, so I'm looking for an efficient way to do this.

Thanks


Solution

  • You can load directly from Amazon EMR, but if you're using a local Hadoop cluster you'll have to export your data to S3 first and then use the COPY command to load it into Redshift from there:

    Using a COPY command to load data
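
The two steps in the answer (export to S3, then COPY into Redshift) can be sketched roughly as below. This is a minimal illustration, not a tested pipeline: it only builds the `hadoop distcp` command line and the COPY statement as strings. The helper names, bucket, prefix, table name, and IAM role ARN are all hypothetical placeholders, and it assumes the data is CSV and that the on-premises cluster has the S3A connector configured with AWS credentials.

```python
def distcp_command(hdfs_path, bucket, prefix):
    """Build a `hadoop distcp` call that copies an HDFS path to S3.

    Assumes the cluster has the s3a:// connector (hadoop-aws) set up;
    bucket and prefix are placeholders.
    """
    return ["hadoop", "distcp", hdfs_path, f"s3a://{bucket}/{prefix}"]


def copy_statement(table, bucket, prefix, iam_role):
    """Build a Redshift COPY statement that loads every file under the prefix.

    Redshift expects the s3:// scheme here (not s3a://). The CSV format
    is an assumption; adjust it to match the exported files.
    """
    return (
        f"COPY {table}\n"
        f"FROM 's3://{bucket}/{prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV;"
    )


if __name__ == "__main__":
    # All values below are hypothetical examples.
    cmd = distcp_command("hdfs:///data/events", "my-bucket", "events/")
    print(" ".join(cmd))
    print(copy_statement(
        "events", "my-bucket", "events/",
        "arn:aws:iam::123456789012:role/RedshiftCopyRole",
    ))
```

The distcp command would be run on the Hadoop cluster, and the COPY statement executed through any SQL client connected to Redshift. For a load of around 1 TB, splitting the export into many files under one prefix generally lets COPY load them in parallel.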