hadoop sqoop hana sap-bw

Import SAP BW over HANA via Sqoop


I am trying to import an SAP HANA table with Sqoop. The problem is that both the table names and the column names contain forward slashes ("/").

For the table name, I can work around this by using the --query option and escaping the name. However, to spread the import across several mappers I need the -m option together with --split-by, and there I cannot specify a column name containing "/" without getting the following error:

20/06/26 08:05:02 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: com.sap.db.jdbc.exceptions.JDBCDriverException: SAP DBTech JDBC: [257] (at 12): sql syntax error: incorrect syntax near "/": line 1 col 12 (at pos 12)

The query generated by Sqoop looks like this:

SELECT MIN(/SOMETHING/KEY_COLUMN), MAX(/SOMETHING/KEY_COLUMN) FROM (select * from SCHEMA."/SOMETHING/TABLE_NAME") AS t1

The Sqoop command:

sqoop import -D org.apache.sqoop.splitter.allow_text_splitter=true \
--driver com.sap.db.jdbc.Driver \
--connect jdbc:sap://alias:port/ \
--split-by "/SOMETHING/KEY_COLUMN" \
--target-dir /target-dir \
--delete-target-dir \
--query "select * from SCHEMA.\"/SOMETHING/TABLE_NAME\" where 1=1 AND \$CONDITIONS" \
--as-parquetfile \
--username username \
--password pw \
--num-mappers 4 \
--verbose

How can I escape the --split-by column correctly?


Solution

  • It worked when I wrapped the quoted column name in parentheses and backslash-escaped the inner double quotes, so that the shell passes them through to Sqoop:

    --split-by "(\"/SOMETHING/KEY_COLUMN\")" \

    The full command:

    sqoop import -D org.apache.sqoop.splitter.allow_text_splitter=true \
      --driver com.sap.db.jdbc.Driver \
      --connect jdbc:sap://alias:port/ \
      --split-by "(\"/SOMETHING/KEY_COLUMN\")" \
      --target-dir /target-dir \
      --delete-target-dir \
      --query "select * from SCHEMA.\"/SOMETHING/TABLE_NAME\" where 1=1 AND \$CONDITIONS" \
      --as-parquetfile \
      --username username \
      --password pw \
      --num-mappers 4 \
      --verbose
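To see why the escaping matters, it can help to print what the shell actually hands to Sqoop for each variant of the --split-by value. The sketch below is only an illustration: the variable names are made up, and the bounds query is built by hand to mimic the MIN/MAX pattern visible in the generated query quoted in the question, on the assumption that Sqoop inserts the --split-by value verbatim. It does not call Sqoop.

```shell
# Value from the question: the shell removes the outer double quotes,
# so no quotes at all reach Sqoop.
split_by_unquoted="/SOMETHING/KEY_COLUMN"

# Value from the solution: the backslashes keep the inner double quotes,
# so the argument still contains a quoted identifier.
split_by_quoted="(\"/SOMETHING/KEY_COLUMN\")"

# Hand-built imitation of the bounds query (assumption: verbatim substitution).
inner='select * from SCHEMA."/SOMETHING/TABLE_NAME"'
bounds_unquoted="SELECT MIN($split_by_unquoted), MAX($split_by_unquoted) FROM ($inner) AS t1"
bounds_quoted="SELECT MIN($split_by_quoted), MAX($split_by_quoted) FROM ($inner) AS t1"

printf '%s\n' "$bounds_unquoted"
printf '%s\n' "$bounds_quoted"
```

The first printed line matches the failing query from the question, with the bare "/" at column 12. The second line contains the column as a double-quoted identifier inside parentheses.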