Tags: python, oracle-database, cx-oracle

Loading huge XLS data into Oracle using Python


I have an XLS file with 3+ million records which I need to dump (direct load) into an Oracle 12c DB using Python 2.7.

I am using the cx_Oracle Python package to establish connectivity to Oracle, but reading and dumping the XLS (using the openpyxl package) is extremely slow, and performance degrades once the row count reaches the hundreds of thousands or millions.

From a scripting standpoint, I have tried two approaches:

  1. Bulk loading, by reading all the values into an array and then dumping them using cursor prepare (with bind variables) and cursor executemany (see the sketch after this list). This does not work well with huge data.

  2. Iterative loading of the data as it is being read. This way also has performance issues.

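For reference, a minimal sketch of what approach 1 can look like with cx_Oracle's prepare()/executemany() and batched commits. The connection string, table name, column names, and batch size are placeholders for illustration, not taken from the question:

```python
# Sketch: batched bulk insert with bind variables (placeholder names throughout).
import cx_Oracle
import openpyxl

BATCH_SIZE = 10000  # commit in chunks instead of holding 3M+ rows in memory

conn = cx_Oracle.connect("user/password@host:1521/service")  # hypothetical DSN
cur = conn.cursor()

# read_only=True streams rows instead of loading the whole workbook into memory
wb = openpyxl.load_workbook("huge_file.xlsx", read_only=True)
ws = wb.active

cur.prepare("INSERT INTO target_table (col1, col2, col3) VALUES (:1, :2, :3)")

batch = []
for row in ws.iter_rows():
    batch.append(tuple(cell.value for cell in row))
    if len(batch) >= BATCH_SIZE:
        cur.executemany(None, batch)  # None reuses the prepared statement
        conn.commit()
        batch = []

if batch:  # flush the final partial batch
    cur.executemany(None, batch)
    conn.commit()

cur.close()
conn.close()
```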
What options and techniques/packages can I use as a best practice to load this volume of data from XLS into an Oracle DB? Is it advisable to load this volume of data via scripting, or should I necessarily use an ETL tool? As of now I only have the option of a Python script, so please answer the former.


Solution

  • If it is possible to export your Excel file as a CSV, then all you need is sqlldr (SQL*Loader) to load the file into the DB.
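
As an illustration, one way to do both the CSV export and the sqlldr call from the same Python script. The file names, control-file contents, connection string, and table/column names below are assumptions for the example, not taken from the answer:

```python
# Sketch: export the worksheet to CSV, then shell out to sqlldr (placeholder names).
import csv
import subprocess
import openpyxl

wb = openpyxl.load_workbook("huge_file.xlsx", read_only=True)
ws = wb.active

with open("huge_file.csv", "wb") as f:  # "wb" for the Python 2.7 csv module
    writer = csv.writer(f)
    for row in ws.iter_rows():
        writer.writerow([cell.value for cell in row])

# A hypothetical load.ctl could contain something like:
#   LOAD DATA
#   INFILE 'huge_file.csv'
#   APPEND INTO TABLE target_table
#   FIELDS TERMINATED BY ','
#   (col1, col2, col3)
subprocess.check_call([
    "sqlldr",
    "user/password@host:1521/service",
    "control=load.ctl",
    "direct=true",  # direct-path load, usually the fastest option
])
```

A direct-path load with SQL*Loader bypasses much of the conventional INSERT overhead, which is why it generally outperforms row-by-row or even executemany() inserts at this volume.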