Search code examples
oracle-databaseoracle11goracle-apexexternal-tables

Location and filename of most recent BADFILE when using external tables


Is there a way to detemine location/filename of latest BADFILE?

When I select from enternal table with BADFILE 'mytable_%a_%p.bad', how do I find out what specific values were %a and %p replaced with?

Or am I stuck with having mytable.bad which I can reliably query and hoping that there will be no race conditions?


Solution

  • As the documentation states

    %p is replaced by the process ID of the current process. For example, if the process ID of the access driver is 12345, then exttab_%p.log becomes exttab_12345.log.

    %a is replaced by the agent number of the current process. The agent number is the unique number assigned to each parallel process accessing the external table. This number is padded to the left with zeros to fill three characters. For example, if the third parallel agent is creating a file and bad_data_%a.bad was specified as the file name, then the agent would create a file named bad_data_003.bad.

    If %p or %a is not used to create unique file names for output files and an external table is being accessed in parallel, then output files may be corrupted or agents may be unable to write to the files.

    Having said that, you must remember the purpose of the badfile in the first place.

    The BADFILE clause names the file to which records are written when they cannot be loaded because of errors. For example, a record would be written to the bad file if a field in the data file could not be converted to the data type of a column in the external table. The purpose of the bad file is to have one file where all rejected data can be examined and fixed so that it can be loaded. If you do not intend to fix the data, then you can use the NOBADFILE option to prevent creation of a bad file, even if there are bad records.

    So the idea ( either for SQL Loader or External Tables with access driver oracle_loader ) is to have a file to store those records, the bad records, not to trace anything regarding them.

    Normally you have external tables associated to text files that you are receiving in a daily/weekly/monthly basis. You store on the badfile those records that can't be read/loaded according to your own table specification.

    You use then the LOGFILE to find what has happened. Those files are generated in the database directory where the external table is created, and you will have one for each time a badfile needs to be generated.