My requirement is to jumble up the names in the oracle table for obfuscation purpose as shown below
Rules
Table employee
ID | Name | Gender |
---|---|---|
1 | Peter | M |
2 | Pascal | M |
3 | Robin | M |
4 | Stephanie | F |
5 | Arya | F |
Table employee -expcted
ID | Name | Gender |
---|---|---|
1 | Robin | M |
2 | Peter | M |
3 | Pascal | M |
4 | Arya | F |
5 | Stephanie | F |
What I tried till now is mentioned in the below link, but somehow I am not able to fix it, as the data can be huge in millions http://sqlfiddle.com/#!4/460bda/5
You can randomize the names within each gender by using row_number()
:
select e.*, e2.name as new_name
from (select e.*,
row_number() over (partition by gender order by name) as seqnum
from employee e
) e join
(select e.*,
row_number() over (partition by gender order by dbms_random.random()) as seqnum
from employee e
) e2
on e.gender = e2.gender and e.seqnum = e2.seqnum;
This does not guarantee that names are never re-used. In fact, that is not possible given the other constraints -- a gender might have only one name. However, this does randomly assign the names, so keeping the same name is quite unlikely.
If your names can be duplicated, use dense_rank()
instead of row_number()
.
Here is a db<>fiddle.
You can avoid mapping the name back to the same name by using a "shifting" methodology. However, such shifting can be undone. The solution of randomly assigning the names with a very small probability of reassignment might be better for obfuscation.