Combine two rows into one based on identifier

My dataset looks like this:

LBID	ADDRESS	FI	LI
101	7283	JOHN	SMITH
101	7283	JANE	JONES

Since John and Jane have the same LBID and the same Address, I want the output to look like this:

LBID	ADDRESS	FI	LI	FI2	LI2
101	7283	JOHN	SMITH	JANE	JONES

I checked on Stack Overflow for similar issues but they all deal with columns that have NULLs. None of mine have NULLs.

I tried STUFF, STRING_AGG and JOIN operations, but none can get it how I want it. My latest attempt was:

SELECT LBID, ADDRESS, STRING_AGG(FI, '& ') AS FI, LI
FROM TABLENAME
GROUP BY LBID, ADDRESS, LI

but since the last names are not the same it still puts it on two different rows.

Any thought on how this can be done?

Solution

If you assume a fixed schema with two record duplicates at best, you can count your duplicate records with a ROW_NUMBER window function. Since it can assume value 2 at best (two duplicates), you can solve the problem with a typical pivot approach and extract the corresponding rownum for each row.

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY LBID, ADDRESS ORDER BY FI) AS rn
    FROM tab
)
SELECT LBID, ADDRESS,
       MAX(CASE WHEN rn = 1 THEN FI END) AS FI1,
       MAX(CASE WHEN rn = 1 THEN LI END) AS LI1,
       MAX(CASE WHEN rn = 2 THEN FI END) AS FI2,
       MAX(CASE WHEN rn = 2 THEN LI END) AS LI2
FROM cte
GROUP BY LBID, ADDRESS

Check the demo here.

Output:

LBID	ADDRESS	FI	LI	FI2	LI2
101	7283	JOHN	SMITH	JANE	JONES

If you have more than 2 duplicate records, it's sufficient to add more pivotal elements to your final query, and extracting coupled "rn=", as per following pseudocode:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY LBID, ADDRESS ORDER BY FI) AS rn
    FROM tab
)
SELECT LBID, ADDRESS,
       MAX(CASE WHEN rn = 1 THEN FI END) AS FI1,
       MAX(CASE WHEN rn = 1 THEN LI END) AS LI1,
       MAX(CASE WHEN rn = 2 THEN FI END) AS FI2,
       MAX(CASE WHEN rn = 2 THEN LI END) AS LI2,
       ...
       ...
       MAX(CASE WHEN rn = <n> THEN FI END) AS FI<n>
       MAX(CASE WHEN rn = <n> THEN LI END) AS LI<n>
FROM cte
GROUP BY LBID, ADDRESS

If you don't know how many duplicate records per (LBID, ADDRESS) you can have at most, you can run a query to know that value:

SELECT TOP(1) COUNT(1) AS cnt
FROM tab
GROUP BY LBID, ADDRESS
ORDER BY cnt DESC

This value will be the amount of couple of pivotal elements to extract:

       MAX(CASE WHEN rn = <n> THEN FI END) AS FI<n>
       MAX(CASE WHEN rn = <n> THEN LI END) AS LI<n>

for the specific case of your sample table, 2 (as per first query).

And if have so many duplicate records per single address, and want a generalized approach, you would go towards dynamic querying. Yet that approach could be less performant and prone to SQL injection.