I'm pretty new to SQL and thought I'd be able to use it to create a list for my employer's clients. Unfortunately, there the majority of clients have more than one account and the file has a distinct row for each account.
I was trying to use a self join to create one row for each client with multiple columns for the accounts.
SELECT DISTINCT A.Account_Number AS Account_1, B.Account_Number AS Account_2, A.Client_Name
FROM client_table AS A, client_table AS B
WHERE A.Account_Number <> B.Account_Number
AND A.Client_Name = B.Client_Name
ORDER BY A.Client_Name;
Unfortunately, the results were such that I would get a table that looks like:
Account_1 | Account_2 | Client_name |
---|---|---|
000001 | 000002 | Joe Shmo |
000001 | 000003 | Joe Shmo |
000002 | 000003 | Joe Shmo |
000002 | 000001 | Joe Shmo |
I understand that for more than two accounts, I'll need more than two joins, but I haven't figured out how to do it.
Is there a way to prevent double entry?
I'm using BigQuery btw.
In the first step number the accounts per client. Then use conditional aggregation to get one column per account.
select
client_name,
max(case when rn = 1 then account_number end) as account1,
max(case when rn = 2 then account_number end) as account2,
max(case when rn = 3 then account_number end) as account3,
max(case when rn = 4 then account_number end) as account4,
max(case when rn = 5 then account_number end) as account5,
max(case when rn = 6 then account_number end) as account6
from
(
select
client_name,
account_number,
row_number() over (partition by client_name order by account_number) as rn
from client_table
) numbered
group by client_name
order by client_name;