Search code examples
sqluser-defined-functionsudfsql-server-2016table-valued-parameters

Does Table-Valued Function (SQL) create table on each call? [performance]


Okay this might sound a noob question, but SQL isn't my really strength, so I am requesting some help here.

I am trying to implement something, but I am concerned about performance issues.

The problem I am trying to fix is something like this:

I have a column with a lot of data separated by commas "," Something like this: data1,data2,data3,data57

What I need is looping through each piece of data separated by commas for all the records, and then do something with that single piece data, do you get it?

I found a solution that can actually help me, but I am worried about system performance, because I might need to make multiple calls to this function using different parameters!

Does a table is created on each call I made to the Table-Valued Function (UDF) or does the sql server saves it as cache? [maybe I would rather need a temporary table?]

Thank you for your help in advance!


Note: The data is not mine, and I should use it as is, so suggesting to change the database is out of question (however I know that would be the best scenario). a Note2: The purpose of this question/problem is to import initial data to the database, performance may not be a serious problem since it won't run many times, but still I wanna regard that issue, and do it the best way I can!


Solution

  • User defined, table-valued functions that are composed of multiple statements, as the one you found is, will create an object in the tempdb system database, populate it and then dispose of it when the object goes out of scope.

    If you want to run this multiple times over the same parameters, you might consider creating a table variable and caching the result in that yourself. If you're going to be calling it on different lists on comma-separated values though, there's not a great way of avoiding the overhead. SQL Server isn't really built for lots of string manipulation.

    Generally, for one-off jobs, the performance implications of this tempdb usage is not going to be a major concern for you. It's more concerning when it's a common pattern in the day-to-day of the database life.

    I'd suggest trying, if you can, on a suitably sized subset of the data to gauge the performance of your solution.

    Since you say you're on SQL Server 2016, you can make use of the new STRING_SPLIT function, something like

    SELECT t.Column1, t.Column2, s.value
    FROM table t
    CROSS APPLY STRING_SPLIT(t.CsvColumn, ',') s
    

    May get you close to where you want, without the need to define a new function. Note, your database needs to be running under the 2016 compatibility level (130) for this to be available, simply running on SQL 2016 isn't enough (they often do this with new features to avoid the risk of backwards-compatibility-breaking changes).