python sql sql-server pyspark apache-spark-sql

querying for duplicates in 2 tables SQL server

I have a table table1 containing only the duplicate person_number and its corresponding account_number. How do I query for these duplicate person_number from main table master_table which has no person_number column but only the account_number column?

table1

person_number	account_number
11	1
12	2
13	3
14	4
15	5

master_table

account_number	account_balance
1	238989
2	4889895
3	68897
4	298729
5	387745
6	354655
7	553455
8	35545

I would like to filter the master_table to only contain the account_number present in table1

resultant_table_of_duplicates

account_number	account_balance
1	238989
2	4889895
3	68897
4	298729
5	387745

Solution

As already pointed in the comments section, one idea to carry out this task is using the EXISTS operator inside the WHERE clause, by checking all records from mastertable whose "account_number" can be found inside table1.

SELECT * 
FROM mastertable m
WHERE EXISTS(SELECT 1 
             FROM table1 t 
             WHERE m.account_number = t.account_number)

Output:

account_number	account_balance
1	238989
2	4889895
3	68897
4	298729
5	387745

Check the demo here.