I have below tables.
create table html_details(A int,b int,c int,d nvarchar(4))
create table pdf_details(A int,b int,c int,d nvarchar(4))
insert into pdf_details values(1,2,3,'pdf')
insert into pdf_details values(1,2,3,'pdf')
insert into pdf_details values(4,5,6,'pdf')
insert into html_details values(1,2,3,'html')
insert into html_details values(1,2,3,'html')
insert into html_details values(4,5,6,'html')
now i am using below query to avoid duplication in each tables.
select distinct a,b,c,d from html_details
union all
select distinct a,b,c,d from pdf_details
but above query gives poor performance because of distinct function in both query.so i am using distinct in outer query.Now performance is improved, but will it give same output?Both query are same in logic?
select distinct a,b,c,d from (
select a,b,c,d from html_details
union all
select a,b,c,d from pdf_details
)a
No. It will not return the same output.
Distinct in individual queries will get you unique records from both the queries and then it will be union-ed. So if there are similar rows in both the queries results, both of them will be present in final result.
Lets say your data is:
Table 1:
1,2,3,pdf
1,2,3,pdf
1,2,3,hello
Table 2:
1,2,3,html
1,2,3,html
1,2,3,hello
First approach's result will be (There's no distinct in final response) -
1,2,3,pdf
1,2,3,hello
1,2,3,html
1,2,3,hello
Second approach's result will be (There's a distinct in final response) -
1,2,3,pdf
1,2,3,html
1,2,3,hello
I hope this explains.