Search code examples
sqloracle-databasesasinsert-update

Compare and Insert using SAS or Oracle SQL


I have created two tables in SAS WORK Library using data queried from Oracle tables using implicit pass through connections. I am now looking to compare the values in colA of table_X with values in colP in table_Y and if they are not equal then insert the colA,colB values of table_X to colP and colQ values in table_Y. The colR value of table_Y needs to be updated by getting its value from a SQL Query. Sample data

Table_X is as follows

colA colB
DAVE 01-Apr-2022
MARK 05-Jan-2021
ROCK 10-FEB-2022

Table_Y is as follows

colP colQ colR
DAVE 01-Apr-2022 100

What I essentially want is If Value.Table_X.colA != Value.Table_Y.colP then INSERT that value of Table_X.colA and Table_X.colB INTO Table_Y.colP and Table_Y.colQ respectively. Table_Y.colR value would be filled by a SQL Query (which I can connect to SQL DBs using Implicit or Explicit Pass-Through connections)

Final Output for Table_Y should look like this

colP colQ colR
DAVE 01-Apr-2022 100
MARK 05-Jan-2021 200
ROCK 10-FEB-2022 500

How can I do this in SAS and/or Oracle SQL (I can push Table_X and Table_Y in Oracle using Explicit-Pass Through and use Oracle Syntax).


Solution

  • Let's just use NAME as your key variable and X and Y as the dataset names to reduce confusion.

    Sounds like you just want to get the list of values of NAME that are in X but not in Y. Easy enough in any flavor of SQL.

    proc sql;
      create table new as 
        select distinct name from X
        except
        select name from Y
     ;
    quit;
    

    Or if X and Y are already sorted by NAME then a simple data step will do.

    data new;
      merge x(in=inx) y(in=iny);
      by name;
      if inx and not iny and firt.name then output;
      keep name;
    run;
    

    Once you have that list you could upload it into your remote database and use it to drive the query to select those new observations to append to Y. Or if the list is small enough you could use the dataset to generate a macro variable that you could use to generate a WHERE clause in your remote query to just select those values.

    Once you have downloaded the new data you could use PROC APPEND to add it to your existing Y dataset. So if your new observations were downloaded into a dataset named NEWY then you could run a step like this:

    proc append base=Y data=NEWY;
    run;