Trying to run bcp from the mssql-tools package (on CentOS 7) to export TSV file data from an HDFS location mounted to the local FS via the NFS Gateway, but running into errors like...
SQLState = S1000, NativeError = 0
Error = [Microsoft][ODBC Driver 17 for SQL Server]Unable to open BCP error-file
or
SQLState = S1000, NativeError = 0
Error = [Microsoft][ODBC Driver 17 for SQL Server]Unable to open BCP host data-file
The bcp command being run looks like...
/opt/mssql-tools/bin/bcp "$TABLE" in \
"$filename" \
$TO_SERVER_ODBCDSN \
-U $USER -P $PASSWORD \
-d $DB \
$RECOMMEDED_IMPORT_MODE \
-t "\t" \
-e ${filename}.bcperror.log
# the actual command, with variables resolved, looks like...
/opt/mssql-tools/bin/bcp "ACCOUNT" in \
"/HDFS_NFS/path/to/tsv/1_0_0.tsv" \
-D -S MyMSSQLServer \
-U myuser -P mypassword \
-d SOME_MSSQL_DB \
-c \
-t "\t" \
-e /HDFS_NFS/path/to/store/errlogs/1_0_0.tsv.bcperror.log
All of this seems fine to me, yet I'm also sometimes getting errors like...
/opt/mssql-tools/bin/bcp: unknown option
usage: /opt/mssql-tools/bin/bcp {dbtable | query} {in | out | queryout | format} datafile ...
so I'm not sure what that's about either. My /etc/odbc.ini file looks like...
[MyMSSQLServer]
Driver=ODBC Driver 17 for SQL Server
Description=My MS SQL Server
Trace=No
Server=<the server's IP>
Anyone know any further debugging tips or fixes for this?
The problem appears to be that the error log file specified by the -e option already existed at the specified location, and HDFS (mounted via NFS or not) did not like bcp trying to overwrite it. You would normally do something like
hadoop fs -put -f /some/local/file /hdfs/location/for/file
and I assume bcp was attempting something else via the NFS gateway that was not this. There may also have been latency problems with bcp accessing the HDFS NFS location. Running the bcp command without the -e option worked in the example originally posted.
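A minimal sketch of that fix, if you still want the error log: clear any pre-existing error file before the run so bcp never has to overwrite it. Paths below are the ones from the example above and are assumptions about your layout; deleting through the NFS mount (or hadoop fs -rm -f on the corresponding HDFS path) should both work.
# remove any stale BCP error log before running bcp
ERRLOG=/HDFS_NFS/path/to/store/errlogs/1_0_0.tsv.bcperror.log
rm -f "$ERRLOG"    # or: hadoop fs -rm -f <the HDFS path behind the mount>
/opt/mssql-tools/bin/bcp "ACCOUNT" in \
    "/HDFS_NFS/path/to/tsv/1_0_0.tsv" \
    -D -S MyMSSQLServer \
    -U myuser -P mypassword \
    -d SOME_MSSQL_DB \
    -c -t "\t" \
    -e "$ERRLOG"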
As a workaround, based on another SO post, I bring the files down (hadoop fs -get ...) to a local temp dir /home/user/tmp/<some uuid>/, do what needs to be done there, and then hadoop fs -put ... the results back, as sketched below.
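A rough sketch of that workaround, assuming a mount-root-to-HDFS-root mapping and using uuidgen for the temp dir name; the HDFS paths here are illustrative placeholders, not the exact ones from my cluster.
# pull the TSV down to local disk, run bcp against the local copy,
# then push the error log back up to HDFS and clean up
TMPDIR=/home/user/tmp/$(uuidgen)
mkdir -p "$TMPDIR"
hadoop fs -get /path/to/tsv/1_0_0.tsv "$TMPDIR/"
/opt/mssql-tools/bin/bcp "ACCOUNT" in \
    "$TMPDIR/1_0_0.tsv" \
    -D -S MyMSSQLServer \
    -U myuser -P mypassword \
    -d SOME_MSSQL_DB \
    -c -t "\t" \
    -e "$TMPDIR/1_0_0.tsv.bcperror.log"
hadoop fs -put -f "$TMPDIR/1_0_0.tsv.bcperror.log" /path/to/store/errlogs/
rm -rf "$TMPDIR"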