Concurrency in Sqoop

I have read documentation recommending that Sqoop be installed on an edge node, for reasons I understand, and stating that a connection to the source database is established for every mapper. My question: are all 4 connections established from the edge node, or does the Sqoop client on the edge node just create some kind of driver that monitors the ingestion, while the datanodes connect to the database, each fetch their part of the data, split it locally, and then put it in HDFS?

Solution

Sqoop is a wrapper over MapReduce that performs import and export operations.

The mappers run in your cluster, while the Sqoop client runs on the edge node.
Each mapper opens its own connection to your database, from whichever cluster node it runs on.
Which rows each mapper consumes is decided by the client when it submits the job.
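
To illustrate the last point, here is a minimal Python sketch of how a client could divide a numeric split column into one range per mapper before submitting the job. This is not Sqoop's actual code: the function name `compute_splits`, the column name `id`, and the exact boundary arithmetic are assumptions made for illustration. In a real import the bounds would come from a MIN/MAX query on the `--split-by` column, and the number of ranges from `--num-mappers` (4 by default).

```python
def compute_splits(min_val, max_val, num_mappers, column="id"):
    """Divide the inclusive range [min_val, max_val] into contiguous
    WHERE-clause fragments, one per mapper (hypothetical helper)."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if max_val < min_val:
        raise ValueError("max_val must not be smaller than min_val")

    total = max_val - min_val + 1
    # Never create more ranges than there are distinct values.
    n = min(num_mappers, total)
    size, extra = divmod(total, n)

    splits = []
    lo = min_val
    for i in range(n):
        # The first `extra` ranges take one additional value each.
        hi = lo + size + (1 if i < extra else 0)  # exclusive upper bound
        if i == n - 1:
            # The last range includes the maximum value itself.
            splits.append(f"{column} >= {lo} AND {column} <= {max_val}")
        else:
            splits.append(f"{column} >= {lo} AND {column} < {hi}")
        lo = hi
    return splits


if __name__ == "__main__":
    # Each printed condition would be handled by one mapper,
    # over that mapper's own database connection.
    for condition in compute_splits(1, 100, 4):
        print(condition)
```

The ranges are computed once on the client side, and each mapper then runs its query with its own condition, so no data passes through the edge node.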
