I have a table called player_stage. I am trying to prepare my data so I can put it into a data warehouse.
I currently have a unreliable work-around that involves a duplicates view and handpicking the values from the duplicates.
I need to create a query that gives duplicates the same surrogate key(sk).
Any idea how I can do this? I've been stuck on t his for 3 days.
If you are using MySQL 8+, then DENSE_RANK
can work here:
SELECT
PLAYER_ID,
PLAYER_NAME,
DB_SOURCE,
DENSE_RANK() OVER (ORDER BY PLAYER_NAME) SK
FROM yourTable;
The above call to DENSE_RANK
would assign the same SK
value to all records belonging to the same player name.
If you are using a version of MySQL earlier than 8+, then we can simulate the dense rank with user variables, e.g.
SELECT t1.PLAYER_ID, t1.PLAYER_NAME, t1.DB_SOURCE, t2.rn AS SK
FROM yourTable t1
INNER JOIN
(
SELECT PLAYER_NAME, @rn := @rn + 1 AS rn
FROM (SELECT DISTINCT PLAYER_NAME FROM yourTable) t, (SELECT @rn := 0) r
ORDER BY PLAYER_NAME
) t2
ON t1.PLAYER_NAME = t2.PLAYER_NAME
ORDER BY
t1.PLAYER_ID;