Search code examples
sqloracle-databasesql-loader

Unable to use the SQL*LOADER(sqlldr) to load the CSV file into the DB


As stated, I want to load the data from CSV to a DB. I have prepared the cluster file for this.

Here is the sample CSV data

Record_Type,Cluster_ID,Forecast_ID,Selling_class,Origin,Destination,Flow_date,POC,Cabin,Fare_Family,Departure_date_path,Airport_path,Operating_airline_path,Operating_flight_number_path,Operating_flight_suffix_path,Travel_purpose,Booking_Threshold,PNR_size_1,PNR_size_2,PNR_size_3,PNR_size_4,PNR_size_5,PNR_size_6,PNR_size_7,PNR_size_8,PNR_size_9,Forecast_Accuracy,Adjusted_Final_Demand,Pick_Up_Curve_Demand_Today,User_Manual_adjustment_relative_final_demand_forecast,User_Manual_adjustment_absolute_final_demand_forecast,User_Manual_adjustment_override_final_demand_forecast,User_Manual_adjustment_absolute_going_class,Robotic_Manual_adjustment_relative_final_demand_forecast,Robotic_Manual_adjustment_absolute_final_demand_forecast,Robotic_Manual_adjustment_override_final_demand_forecast,Robotic_Manual_adjustment_absolute_going_class,Significant_period_indicator,Snapshot_time
C,22,1,I,NCE,LHR,20250305,FR,C,SHC,20250305,NCELHR,BA,349,,,7,1,2,3,4,5,6,7,8,9,5,10.01,1.3,7.9,3.5,3.5,3.5,3.5,3.5,3.5,3.5,TRUE,2.03E+13

Here is the control file

OPTIONS (SKIP=1)
LOAD DATA
INFILE 'main_cluster_file.csv'
APPEND INTO TABLE YMODAPP.TCLUSTER
REPLACE
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
    ROW_ID "YMODAPP.TCLUSTER_SEQ.NEXTVAL",
    CLUSTER_ID "Cluster_ID",
    FORECAST_ID "Forecast_ID",
    CLS_CD "Selling_class",
    OD_ORIG_ARP_CD "Origin",
    OD_DEST_ARP_CD "Destination",
    OD_FLOW_DT "TO_DATE(:'Flow_date', 'YYYYMMDD')",
    CMPT_CD "Cabin",
    POC_CTRY "POC",
    FARE_FAMILY_CD "Fare_Family",
    ARP_PATH "Airport_path",
    OP_AIRLINE_PATH "Operating_airline_path",
    OP_FLT_PATH "Operating_flight_number_path",
    OP_FLT_SUFFIX "Operating_flight_suffix_path",
    TRVL_PURPOSE "Travel_purpose",
    BKG_THRESHOLD "Booking_Threshold",
    CREATE_DT_TM "SYSDATE",
    CREATED_BY "U"
)

INTO TABLE YMODAPP.TCLUSTER_SUMMARY
REPLACE
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
    ROW_ID "YMODAPP.TCLUSTER_SUMMARY_SEQ.NEXTVAL",
    CLUSTER_ID "Cluster_ID",
    FORECAST_ID "Forecast_ID",
    CLS_CD "Selling_class",
    PNR_SIZE_1 "PNR_size_1",
    PNR_SIZE_2 "PNR_size_2",
    PNR_SIZE_3 "PNR_size_3",
    PNR_SIZE_4 "PNR_size_4",
    PNR_SIZE_5 "PNR_size_5",
    PNR_SIZE_6 "PNR_size_6",
    PNR_SIZE_7 "PNR_size_7",
    PNR_SIZE_8 "PNR_size_8",
    PNR_SIZE_9 "PNR_size_9",
    FCST_ACCURACY "Forecast_Accuracy",
    ADJ_FINAL_DEMAND "Adjusted_Final_Demand",
    PICKUP_CURVE "Pick_Up_Curve_Demand_Today",
    USER_MANUAL_REL_FINAL_DMD "User_Manual_adjustment_relative_final_demand_forecast",
    USER_MANUAL_ABS_FINAL_DMD "User_Manual_adjustment_absolute_final_demand_forecast",
    USER_MANUAL_OVR_FINAL_DMD "User_Manual_adjustment_override_final_demand_forecast",
    USER_MANUAL_ABS_GOING_CLS "User_Manual_adjustment_absolute_going_class",
    ROBOTIC_REL_FINAL_DMD "Robotic_Manual_adjustment_relative_final_demand_forecast",
    ROBOTIC_ABS_FINAL_DMD "Robotic_Manual_adjustment_absolute_final_demand_forecast",
    ROBOTIC_OVR_FINAL_DMD "Robotic_Manual_adjustment_override_final_demand_forecast",
    ROBOTIC_ABS_GOING_CLS "Robotic_Manual_adjustment_absolute_going_class",
    SIGNIFICANT_PD "CASE WHEN :'Significant_period_indicator' = 'TRUE' THEN 1 ELSE 0 END",
    SNAPSHOT_DT_TM "SYSDATE",
    CREATE_DT_TM "SYSDATE"
)

My task is to extract the data I need from the main cluster file, and upload it to two databases, TCLUSTER, TCLUSTER_SUMMARY

I'm having issues with mapping the columns to the respective column names in the DB. Also, how can I map the column names if the column names in the CSV have spaces? Is there a better way to achieve this?

you can forget about the timestamp related data for now as I've manually loaded the SYSDATE to it.

Here is the error message from the loader while executing command to load to the database:

value used for ROWS parameter changed from 64 to 22 SQL*Loader-291: Invalid bind variable :'Flow_date' in SQL string for column OD_FLOW_DT.

I have tried removing the single quotation marks, but they were invain. Also tried including backslashes, but didn't work.

PLEASE HELP


Solution

  • As said by @Paul W in comments, fields in the control file are names that sqlldr internally gives to your CSV columns, ignoring the names you gave in the CSV's header line. And by default they get pushed to the field in the same order as they are listed.

    Thus for simple fields you should just let the field name (having them match the table columns name will help not getting lost) and remove the quoted part.

    After the internal field name, you can put an optional SQL type, or SQL convert or fill snippet; in that last case, you can refer the "internal" field name with :.

    Thus you could have:

    ROW_ID "YMODAPP.TCLUSTER_SEQ.NEXTVAL",
    CLUSTER_ID, -- No need to add anything, we just need to materialize the column and it will get pushed to column 2 of the table.
    […]
    OD_FLOW_DT "TO_DATE(:OD_FLOW_DT, 'YYYYMMDD')", -- Field in the TO_DATE has to be :<internal field name = first word before the quotes>.
    -- or:
    OD_FLOW_DT DATE 'YYYYMMDD',
    

    (based on past experience, with no DB to test now, but isn't it worth a try?)

    See an answer from 2019 for an unconventional way of remapping, that helps understanding what is in memory and what ends up into DB.