Search code examples
sqlsql-servercalculated-columnsstring-aggregation

Using SQL String_AGG for Computed Column


I'm trying to figure out the best way to address this scenario and eliminate the duplication of data. I have two tables: Product and Attribute. (SQL Server Express Edition (14.0.3281.6), SSMS v18.4, running on Windows Server 2019 Datacenter hosted in Azure)

Product:
SeqNum (PK Int Seed), ProductID (Unique Int), Description

Attribute:
SeqNum (PK Int Seed), ProductSeqNum (FK to Product table), Name, Value

Product is related to Attribute in a one-to-many relationship. A single product can have dozens of attributes. Typically, I will use String_Agg() to concatenate all of the attributes into a single column separated by commas so that a query for products and attributes returns 1 row per product with all the attributes in a single column. That works great for returning a product and all of its attributes.

The product table has a description field which is populated "manually" with description values that are often times, if not always, also attributes. For example:

Product with SeqNum 1 in the Product table has the following attribute records:

1, 1, "Name", "Widget"
2, 1, "Color", "Blue"
3, 1, "Size", "Large"
4, 1, "Weight", "150lbs"
5, 1, "Height", "3ft"
6, 1, "Width", "2ft"
7, 1, "Material", "Plastic"
8, 1, "Country of Origin", "USA"
9, 1, "Warranty Length", "2 Years"

And let's say the Description field in the Product table has "Widget, Blue, 150lbs, Large" in it because a human picked out those values and entered them into the description field.

In my mind, I think we should be able to add a column to the attribute table to identify the attributes we want to use in the description, in the order we want them listed, and have SQL calculate or build the description without duplicating the data.

For example:

Let's say we have the same Attribute records as above, but we add a Type column:

1, 1, "Name", "Widget", D1
2, 1, "Color", "Blue", D2
3, 1, "Size", "Large", D4
4, 1, "Weight", "150lbs", D3
5, 1, "Height", "3ft", A
6, 1, "Width", "2ft", A
7, 1, "Material", "Plastic", A
8, 1, "Country of Origin", "USA", A
9, 1, "Warranty Length", "2 Years", A

The Types that start with a "D" are descriptions, with the number following determining the order, and the "A" Types are simply attributes and we don't care about the sorting order.

What I cannot figure out is how to write a String_Agg that will concatenate just the attributes with Types that start with a D, then order them. I'm also not sure if this can be persisted to the database or only calculated during a query. Then there is also the matter of some products having more description attributes than others.

Is there a better way to solve this problem?


Solution

  • If I understand correctly, you can use conditional logic to get the attributes you want:

    select ProductSeqNum,
           string_agg(case when type like 'D%' then value end, ',')
    from t
    group by ProductSeqNum;
    

    Persisting this with the table is more trouble than it is worth. I would recommend calculating it on-the-fly unless you have a specific performance need to do otherwise.