Tags: hadoop, sqoop, sqoop2

Sqoop speculative execution


I have the following questions about Sqoop:

  • Can speculative execution be turned on or off for a Sqoop import/export job?
  • Is there an option to set the number of reducers in a Sqoop import/export job? From what I can tell, Sqoop does not need any reducers, but I'm not sure that is correct. Please correct me if I'm wrong.
  • I have used Sqoop with MySQL and Oracle. Which other databases can it be used with?

Thanks


Solution

  • 1) Sqoop turns speculative execution off by default. If multiple mapper attempts ran for the same task, the same rows could be written twice, leaving duplicate data in HDFS. It is disabled to avoid that discrepancy.

    2) The number of reducers for a Sqoop job is 0, because it is a map-only job that simply dumps data into HDFS. Nothing is being aggregated, so no reduce phase is needed.

    3) Besides MySQL and Oracle, you can use PostgreSQL and HSQLDB. However, direct import (the --direct option) is supported for MySQL and PostgreSQL.
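
To make the three points concrete, here is a minimal sketch that only assembles a `sqoop import` command line; it does not run Sqoop. The helper `build_sqoop_import`, the host `db.example.com`, the table `orders`, and the target directory are hypothetical names chosen for illustration. It assumes a Hadoop 2+ cluster, where the map-side speculative execution property is `mapreduce.map.speculative` (older releases used `mapred.map.tasks.speculative.execution`). Sqoop already disables speculation on its own, so passing the property is redundant in the default case, and whether Sqoop honors an explicit `true` may depend on the version. There is no reducer setting in the sketch because the job is map-only; parallelism is controlled with `--num-mappers`.

```python
import shlex


def build_sqoop_import(connect_url, table, target_dir, num_mappers=4, direct=False):
    """Return the argv list for a `sqoop import` run (nothing is executed here).

    Hypothetical helper for illustration; the flags themselves are standard
    Sqoop 1 options.
    """
    argv = [
        "sqoop", "import",
        # Generic Hadoop options (-D) must come before tool-specific arguments.
        # Assumption: Hadoop 2+ property name for map-side speculative execution.
        "-D", "mapreduce.map.speculative=false",
        "--connect", connect_url,
        "--table", table,
        "--target-dir", target_dir,
        # Sqoop jobs are map-only, so the only parallelism knob is the mapper count.
        "--num-mappers", str(num_mappers),
    ]
    if direct:
        # Direct mode uses the database's native dump tools (MySQL, PostgreSQL).
        argv.append("--direct")
    return argv


# Hypothetical connection details, for illustration only.
cmd = build_sqoop_import(
    "jdbc:mysql://db.example.com/sales", "orders", "/data/orders"
)
print(shlex.join(cmd))
```

Passing `direct=True` appends `--direct`, which is only meaningful for connectors that support direct mode.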