sqlsql-serveretldml

Better way to INSERT/UPDATE a Customer table from a Key/Value style Property table?


I have a customer master table called Customer. Everything in this table comes from a key/value style table called Cust_Property.

The Cust_Property table has 3 columns:

CustomerID, Property, Value

The Property column may contain First_Name with a value of John. Sort of like a pre-pivoted table. I need to update the columns in Customer table with the values of the associated Property columns in Cust_Property table.

Rules

  • If there is a new CustomerID in the Cust_Property table, it will need to be added as a new row to the Customer table, as well as all appropriate properties.
  • All data in the Customer table will also be in the Cust_Property table. That means that not every record needs to be updated. Only those that have changed or are new.
  • Records are only added/updated in Customer table, not removed.
  • There are properties in the Property table where a corresponding column doesn't exist in the Customer table, so those are just ignored.

DDL

CREATE TABLE #Customer
(
    Customerid int,
    FirstName varchar(50),
    LastName varchar(50),
    Address1 varchar(100),
    Address2 varchar(100),
    Address3 varchar(100)
)

CREATE TABLE #Cust_Property
(
    CustomerID int,
    Property varchar(50),
    Value varchar(50)
)

INSERT INTO #Customer (Customerid, FirstName, LastName, Address1, Address2, Address3) 
VALUES(1, N'John', N'Smith', N'123 happy lane', NULL, NULL);

INSERT INTO #Customer (Customerid, FirstName, LastName, Address1, Address2, Address3)
VALUES(2, N'Dwight', N'Schrute', N'33 1st Ave', N'Apt 5', NULL);

INSERT INTO #Customer (Customerid, FirstName, LastName, Address1, Address2, Address3)
VALUES(3, NULL, NULL, NULL, NULL, NULL);


INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(3, N'First_Name', N'Michael');

INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(3, N'Last_Name', N'Scott');

INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'First_Name', N'Jim');

INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'Last_Name', N'Halpert');

INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'Address1', N'644 Scranton Rd');

INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'Nickname', N'Jimmy');

INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(1, N'First_Name', N'John');

Tables:

Customer

CustomerID FirstName LastName Address1 Address 2 Address 3
1 John Smith 123 happy lane
2 Dwight Schrute 33 1st Ave Apt 5
3

Cust_Property

CustomerID FirstName LastName
3 First_Name Michael
3 Last_Name Scott
8 First_Name Jim
8 Last_Name Halpert
8 Address1 644 Scranton Rd
8 Nickname Jimmy
1 First_Name John

Desired Customer table end result:

  • Customer 3's First_Name and Last_Name columns are updated
  • Add customer 8 to Customer table b/c it doesn't already exist
  • Update all of customer 8's properties except property Nickname because that doesn't exist in the Customer table
  • Ignore the First_Name property for CustomerID = 1 because it is the same in the Customer table, so no update needed.

My current method: first find and insert new CustomerIDs

INSERT INTO #Customer (CustomerID)
    SELECT DISTINCT CustomerID 
    FROM #Cust_Property a
    WHERE NOT EXISTS (SELECT * FROM #Customer x 
                      WHERE a.CustomerID = x.CustomerID)

Then update properties

 UPDATE #Customer
 SET #Customer.FirstName = a.Value
 FROM #Cust_Property a
 WHERE #Customer.CustomerID = a.CustomerID 
   AND a.Property = 'First_Name'

 UPDATE #Customer
 SET #Customer.LastName = a.Value
 FROM #Cust_Property a
 WHERE #Customer.CustomerID = a.CustomerID 
   AND a.Property = 'Last_Name'

 UPDATE #Customer
 SET #Customer.Address1 = a.Value
 FROM #Cust_Property a
 WHERE #Customer.CustomerID = a.CustomerID 
   AND a.Property = 'Address1'
    
 UPDATE #Customer
 SET #Customer.Address2 = a.Value
 FROM #Cust_Property a
 WHERE #Customer.CustomerID = a.CustomerID 
   AND a.Property = 'Address2'
    
 UPDATE #Customer
 SET #Customer.Address3 = a.Value
 FROM #Cust_Property a
 WHERE #Customer.CustomerID = a.CustomerID 
   AND a.Property = 'Address3'

In my actual tables, there are hundreds of different properties in Cust_Properties table, 40 columns in the Customer table that need updating and ~2M customer records.

Is there a better way that separate update statements for each? At present, I can't use an ETL tool, though I could technically use Python if that would be more efficient/faster.


Solution

  • maybe one update like -

    UPDATE c
        set c.[FirstName] = isnull(cp.[First_Name],c.[FirstName])
        ,c.[LastName] = isnull(cp.[Last_Name],c.[FirstName])
        ,c.[Address1] = isnull(cp.[Address1],c.[Address1])
    FROM #Customer c
    INNER JOIN (
            SELECT * 
            FROM #Cust_Property tb
            pivot(
                max(value)
                for Property in ( [First_Name],[Last_Name],[Address1],[Nickname])
                )pv
                )cp
        on c.Customerid = cp.CustomerID