When we import from an RDBMS to HDFS using Sqoop, we give a target directory in which to store the data. Once the job completes, the output file is named part-m-00000, as mapper output. Is there any way to specify the name of the file in which the data will be stored? Does Sqoop have an option for that?
You can specify --target-dir <dir> to set the directory into which all the data is imported. In this directory you will see many part files (e.g. part-m-00000). These part files are created by the individual mappers (remember the -m <number> option in your sqoop import command).
Since the data is imported into multiple files, how would you name each part file?
I do not see any additional benefit in renaming them.
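If a downstream consumer really does need specific file names, one possible workaround is to rename the part files after the import finishes. The sketch below only prints the hdfs dfs -mv commands it would run, so it can be reviewed before anything is moved. The function name plan_renames, the directory /user/demo/orders, the prefix orders and the .txt suffix are all hypothetical. It also assumes plain-text output where each mapper writes a file named part-m-NNNNN; compressed or Avro/Parquet imports may carry an extra extension.

```shell
#!/bin/sh
# Sketch: print (not execute) the rename commands for Sqoop part files.
# Assumes one part-m-NNNNN file per mapper with no extra file extension.
plan_renames() {
  target_dir="$1"   # the directory passed to --target-dir
  prefix="$2"       # hypothetical file name prefix you want instead of part-m
  mappers="$3"      # the number passed to -m
  i=0
  while [ "$i" -lt "$mappers" ]; do
    part=$(printf 'part-m-%05d' "$i")
    echo "hdfs dfs -mv $target_dir/$part $target_dir/$prefix-$i.txt"
    i=$((i + 1))
  done
}

# Example: an import that was run with --target-dir /user/demo/orders -m 2
plan_renames /user/demo/orders orders 2
```

Piping the printed lines to sh would perform the renames. With -m 1 there is only a single part file, so a single rename is enough.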