I'm building a query that needs data from 5 tables. I've been told by a DBA in the past that specifying a list of columns vs getting all columns (*) is preferred from some performance/memory aspect. I've also been told that the database performs a JOIN operation behind the scenes when there's a list of tables in the FROM clause, to create one table (or view).
The existing database has very little data at the moment, as we're at a very initial point. So not sure I can measure the performance hit in practice. I am not a database pro. I can get what data I need. The dillema is, at what price.
Added: At the moment I'm working with MS SQL Server 2008 R2.
My questions are:
Is there a performance difference and why, between the following: a. SELECT ... FROM tbl1, tbl2, tbl3 etc for simplicity? (somehow I feel that this might be a performance hit) b. SELECT ... FROM tbl1 inner join tbl2 on ... inner join tbl3 on ... etc (would this be more explicit to the server and save on performance/memory)? c. SELECT ... FROM (select x,y,z from tbl1) as t1 inner join ... etc (would this save anythig? or is it just extra select statements that create more work for the server and for us)?
Is there yet a better way to do this?
Below are two queries that both get the slice of data that I need. One includes more nested select statements.
I apologize if they are not written in a standard form or helplessly overcomplicated - hopefully you can decipher. I try to keep them organized as much as possible.
Insights would be most appreciated as well. Thanks for checking this out.
5 tables: devicepool, users, trips, TripTracker, and order
Query 1 (more select statements):
SELECT
username,
base.devid devid,
tripstatus,
stops,
stopnumber,
[time],
[orderstatus],
[destaddress]
FROM
((
( SELECT
username,
devicepool.devid devid,
groupid
FROM
devicepool INNER JOIN users
ON devicepool.userid = users.userid
WHERE devicepool.groupid = 1
)
AS [base]
INNER JOIN
(
SELECT
tripid,
[status] tripstatus,
stops,
devid,
groupid
FROM
trips
)
AS [base2]
ON base.devid = base2.devid AND base2.groupid = base.groupid
INNER JOIN
(
SELECT
stopnumber,
devid,
[time],
MAX([time]) OVER (PARTITION BY devid) latesttime
FROM
TripTracker
)
AS [tracker]
ON tracker.devid = base.devid AND [time] = latesttime)
INNER JOIN
(
SELECT
[status] [orderstatus],
[address] [destaddress],
[tripid],
stopnumber orderstopnumber
FROM [order]
)
AS [orders]
ON orders.orderstopnumber = tracker.stopnumber)
Query 2:
SELECT
username,
base.devid devid,
tripstatus,
stops,
stopnumber,
[time],
[orderstatus],
[destaddress]
FROM
((
( SELECT
username,
devicepool.devid devid,
groupid
FROM
devicepool INNER JOIN users
ON devicepool.userid = users.userid
WHERE devicepool.groupid = 1
)
AS [base]
INNER JOIN
trips
ON base.devid = trips.devid AND trips.groupid = base.groupid
INNER JOIN
(
SELECT
stopnumber,
devid,
[time],
MAX([time]) OVER (PARTITION BY devid) latesttime
FROM
TripTracker
)
AS [tracker]
ON tracker.devid = base.devid AND [time] = latesttime)
INNER JOIN
[order]
ON [order].stopnumber = tracker.stopnumber)
Is there a performance difference and why, between the following: a. SELECT ... FROM tbl1, tbl2, tbl3 etc for simplicity? (somehow I feel that this might be a performance hit) b. SELECT ... FROM tbl1 inner join tbl2 on ... inner join tbl3 on ... etc (would this be more explicit to the server and save on performance/memory)? c. SELECT ... FROM (select x,y,z from tbl1) as t1 inner join ... etc (would this save anythig? or is it just extra select statements that create more work for the server and for us)?
a) and b) should result in the same query plan (although this is db-specific). b) is much preferred for portability and readability over a). c) is a horrible idea, that hurts readability and if anything will result in worse peformance. Let us never speak of it again.
Is there yet a better way to do this?
b) is the standard approach. In general, writing the plainest ANSI SQL will result in the best performance, as it allows the query parser to easily understand what you are trying to do. Trying to outsmart the compiler with tricks may work in a given situation, but does not mean that it will still work when the cardinality or amount of data changes, or the database engine is upgraded. So, avoid doing that unless you are absolutely forced to.