I recently reviewed a colleague's SQL Server function (in SQL Server 2008
) where he had put a T-SQL
query within a scalar user-defined function. That function was then used in the Select clause of a query.
I thought that having a query inside a function would have terrible performance because I assumed each row returned would have to run the query - even though that query is optimised to query on an index.
Here's an example of what I'm talking about:
create table [PERSON] (
ID int primary key,
FIRSTNAME NVARCHAR(100),
MIDDLENAME NVARCHAR(100) null,
LASTNAME NVARCHAR(100))
GO
INSERT INTO PERSON (ID, FIRSTNAME, MIDDLENAME, LASTNAME)
VALUES (1, 'BOB', 'M', 'BLUE')
INSERT INTO PERSON (ID, FIRSTNAME, MIDDLENAME, LASTNAME)
VALUES (2, 'VALERIE', 'J', 'GREEN')
INSERT INTO PERSON (ID, FIRSTNAME, MIDDLENAME, LASTNAME)
VALUES (3, 'SIMON', 'D', 'RED')
INSERT INTO PERSON (ID, FIRSTNAME, MIDDLENAME, LASTNAME)
VALUES (4, 'LIONEL', 'W', 'BROWN')
GO
-- Scalar Function with T-SQL
CREATE FUNCTION dbo.fn_FormatNameFromId(
@pnPersonId int
)
Returns nvarchar(300)
AS
Begin
return (Select FIRSTNAME + ' ' + MIDDLENAME + ' ' + LASTNAME
From PERSON
Where ID = @pnPersonId)
End
go
-- Scalar Function without T-SQL
CREATE FUNCTION dbo.fn_FormatNameFromValues(
@psFirstName nvarchar(100),
@psMiddleName nvarchar(100),
@psLastName nvarchar(100)
)
Returns nvarchar(300)
AS
Begin
return (@psFirstName + ' ' + @psMiddleName + ' ' + @psLastName)
End
go
-- T-SQL within function
select dbo.fn_FormatNameFromId(ID)
from PERSON
-- Pass values directly
select dbo.fn_FormatNameFromValues(FIRSTNAME, MIDDLENAME, LASTNAME)
FROM PERSON
When comparing the execution plans for both these selects, I could find no difference in the cpu or memory performance of the two functions. Even when scaling up the test to return tens of thousands of rows, there was no performance difference.
Can anybody explain how or why the fn_FormatNameFromId
is not causing a loss of performance?
You can see the difference between Estimated Execution Plan
and Actual Execution Plan
You are personally expecting select dbo.fn_FormatNameFromId(ID)
from PERSON
to work as shown on the screenshot:
for every row from Query 1 (Clustered Index Scan) do Query 2 (Clustered Index Seek - or Subquery search by ID)
But sql-server sees the overall pattern of these queries and optimizes actual query plan to solely one simple Clustered Index Scan
as sql-server does in your second query select dbo.fn_FormatNameFromValues(FIRSTNAME, MIDDLENAME, LASTNAME)
FROM PERSON
.