Search code examples
sqlsql-serversql-server-2008stored-procedureshashtable

Hash Table Data Structure in SQL Server


For the last few days, I've been reading an ebook on data structures and well, frankly speaking, many things are already gone from my head. Just reviewing them and trying to make clear again. I was going through hash tables and get to familiar with it again. So I know and heard, SQL Server uses hash tables internally and many of the threads of stackoverflow.com and forums.asp.net asked about creating hash tables in SQL Server as it stores temporary data. So let me give an example that I've used in a stored procedure using temp table: (Avoid it and it's too long. Just for an example)

1st:

CREATE PROCEDURE [dbo].[Orders]
    @OrderLine int
AS
BEGIN
    DECLARE @t1 TABLE(Date1 date, 
                      OrderID VARCHAR(MAX), 
                      EmployeeName VARCHAR(MAX), 
                      DeliveryDate date, 
                      StoreName VARCHAR(MAX),
                      DeliveryAddress VARCHAR(MAX), 
                      ItemName VARCHAR(MAX), 
                      Quantity FLOAT)

    INSERT INTO @t1(Date1, OrderID, EmployeeName, DeliveryDate, StoreName, DeliveryAddress, ItemName, Quantity)
        (SELECT DISTINCT 
             CONVERT(VARCHAR(11), DemandOrder.POCreationDate, 6) AS DemandOrderDate, 
             DemandOrder.OrderID, EmployeeDetails.EmployeeName,
             CONVERT(DATE, DemandOrder.DeliveryDate) AS ExpectedDeliveryDate, 
             StoreDetails.StoreName,
             DemandOrder.DeliveryAddress, Item.ItemName, 
             DemandOrderLine.Quantity 
         FROM 
             DemandOrder 
         INNER JOIN 
             DemandOrderLine ON DemandOrder.OrderID = DemandOrderLine.OrderID 
         INNER JOIN 
             Item on DemandOrderLine.ItemID=Item.ItemID 
         INNER JOIN 
             EmployeeDetails ON EmployeeDetails.EmployeeID = DemandOrder.EmployeeID 
         INNER JOIN 
             StoreDetails ON DemandOrderLine.StoreID = StoreDetails.StoreID
         WHERE 
             DemandOrderLine.OrderLine = @OrderLine)

    DECLARE @t2 TABLE(Approvedby VARCHAR(MAX)) 

    INSERT INTO @t2(Approvedby)
        (SELECT EmployeeDetails.EmployeeName 
         FROM EmployeeDetails 
         INNER JOIN DemandOrderLine ON DemandOrderLine.ApprovedBy = EmployeeDetails.EmployeeID)

    SELECT DISTINCT 
        CONVERT(VARCHAR(11), Date1, 6) AS Date, 
        OrderID, EmployeeName,
        CONVERT(VARCHAR(11), DeliveryDate, 6) AS ExpectedDeliveryDate, 
        StoreName, Approvedby, DeliveryAddress, 
        ItemName, Quantity  
    FROM 
        @t1 
    CROSS JOIN 
        @t2
END   

Another one, from an example, that says in stored procedure, hash tables can't be used. So here it's:

2nd:

CREATE PROCEDURE TempTable AS ---- It's actually not possible in SP

CREATE table #Color
(
    Color varchar(10) PRIMARY key
)

INSERT INTO #color 
    SELECT 'Red' 
    UNION 
    SELECT 'White'
    UNION 
    SELECT 'green'
    UNION 
    SELECT 'Yellow'
    UNION 
    SELECT 'blue'

DROP TABLE #color

CREATE table #Color
(
    Color varchar(10) PRIMARY key
)

INSERT INTO #color 
    SELECT 'Red' 
    UNION 
    SELECT 'White'
    UNION 
    SELECT 'green'
    UNION 
    SELECT 'Yellow'
    UNION 
    SELECT 'blue'

DROP TABLE #color
GO

So my question is can I say the 1st one is an example of hash table as it uses temp tables and if not, why can't we use it in the stored procedure? Again, if it's created internally, why do we need to create a hash table again for working purposes (Though it has performance issues, just wondering to know if the above examples serve for the purpose). Thanks.

Note: I faced an interview last month and was discussing about it. That's why making sure if I was correct in my views.


Solution

  • Hash-based algorithms are important for any powerful database. These are used for aggregation and join operations. Hash-based joins have been there since version 7.0 -- which is really old (thanks to Martin Smith). You can read more about them in the documentation.

    SQL Server 2014 introduced hash-based indexes for memory optimized tables (see here). These are an explicit use of hash tables. In general, though, the tree-based indexes are more powerful because they can be used in more situations:

    • For range lookups (including like).
    • For partial key matches.
    • For order by.

    A hash index can only be used for an exact equality match (and group by).