Search code examples
pythonpython-3.xpandasoracle-databasecx-oracle

from pandas dataframe upsert into oracle table based on primarykey


I have a Oracle table where the data is like below -

select * from test
GRP_ID  GRP_NM      MRG_ID
0024    Abac Expl   17
0027    Wlsy Inc    8404

I have a dataframe where the updated data is like below -

print(df)
    grp_id  grp_nm          mrg_id
0   0024    Abac Sol        17
1   0027    Wlsy Inc        8407
2   0029    Xyz Corp        1986

I want to update oracle table grp_id 0024 and 0027 with the updated value of grp_nm and mrg_id from dataframe. Also want to insert new grp_id 0029 into table. Could you please help?

Constraints: I do not have oracle db access so i can not create a temp table from df and then update test table from temp table. Also , this is just sample data. I have almost 200K rows to process.

Expected Output :

sql > select * from test; 
GRP_ID GRP_NM MRG_ID 
0024 Abac Sol 17 
0027 Wlsy Inc 8407 
0029 Xyz Corp 1986

Solution

  • You can use a Merge Statement along with your Dataframe

    import cx_Oracle
    import pandas as pd
    from pandas import DataFrame
    
    dataset = pd.DataFrame({'GRP_ID': ['0024','0027','0029'],
                            'GRP_NM': ['Abac Sol','Wlsy Inc','Xyz Corp'],
                            'MRG_ID': [17,8404,1986]})
    con = cx_Oracle.connect('uname/pwd@host:port/service_name')
    cursor = con.cursor()
    
    sql ='merge into test ';
    sql+=' using dual';
    sql+='   on ( grp_id = :1 )';
    sql+=' when matched then update set grp_nm = :2, mrg_id = :3';  
    sql+=' when not matched then insert( grp_id, grp_nm, mrg_id )'; 
    sql+='                       values( :1, :2, :3 )';
    df_list = dataset.values.tolist()
    
    for i in range(len(df_list)):
        cursor.execute(sql,df_list[i])
    
    con.commit()
    

    where the columns GRP_NM and MRG_ID are updated for already existing values for GRP_ID column within the table, and new records are added for non-existing values for GRP_ID column.