I need a query that will search for all records in a table and return only the ones that have 2 or more values that end with same 4 digits, but i dont have specific four digits i can give you, i just need it to find everything that has more than one value that ends the same. eg. table has:
person_number------------name
==============================
1234567------------------john
00004567-----------------mark
9999999------------------mike
0009999------------------lidia
10101010-----------------alex
23232323-----------------matt
98748484-----------------steve
I want it to return only:
person_number------------name
==============================
1234567------------------john
00004567-----------------mark
9999999------------------mike
0009999------------------lidia
You see, i want it to return duplicates that end with the same last 4 digits, but i dont have specific digits to give you so i cant use something like '%9999'. It also needs to be sorted properly so I have them next to each other. I need to find duplicates and delete one of them, but i dont know which one until i actually see all the other values in other columns and pick which one to delete manually.
Thanks!
I just want extend Sick answer.
You say you will like to choose which one to eliminate. But you can also include a ORDER
clausule with a CASE
statment to filter the one you want eliminate.
In this case I order by "name"
so you can delete all with rn > 1
and keep the first name.
select "person_number", "name", rn, zero_count
from
(
select "person_number",
"name",
substr("person_number", 1, 1),
count(1) over (partition by substr("person_number",-4)) as Cnt,
SUM(case
when substr("person_number", 1, 1) = '0' then 1
else 0 end) over (partition by substr("person_number",-4)) as zero_count,
row_number() over (partition by substr("person_number",-4) order by "name") as rn
from person
)
Where Cnt > 1
and zero_count > 0
ORDER BY substr("person_number",-4)
I increase the data sample
zero_count
to calculate how many rows start with 0
in each groupZERO_COUNT = 2
)