Search code examples
sqldata-structuresdata-storage

Data structure enabling "Search by order"


I would like to know what data structure / storage strategy I should use for this problem.

Each data entry in the database consists of a list of multiple ordered items, such as A-B-C-D, where A, B, C, D are different items.

Suppose I have 3 entries in a database,

A-B-C-D

E-F-G

G-H-B-A

When the user entered some unordered items, I have to find the matching ordered entry(ies) from the database. For example, if user enters A,B,G,H, I want to return G-H-B-A from the database to the user.

What should be my data storage strategy?


Solution

  • You're best off storing the ordered and unordered elements separately, otherwise you'll need to search on all permutations of the ordered elements, which would be time consuming.

    Try this:

    /* Create a table to track your items (A, B, C, etc.). It contains all possible elements */
    CREATE TABLE [Items](
        [Value] [char](1) NOT NULL,
     CONSTRAINT [PK_Items] PRIMARY KEY CLUSTERED ([Value]))
    
    /* Create a table to track their grouping and stated ordering */
    CREATE TABLE [Groups](
        [ID] [int] NOT NULL,
        [Order] [text] NOT NULL,
     CONSTRAINT [PK_Groups] PRIMARY KEY CLUSTERED ([ID]))
    
    /* Create a mapping table to associate them */
    CREATE TABLE [ItemsToGroups](
        [Item] [char](1) NOT NULL,
        [Group] [int] NOT NULL
    )
    
    ALTER TABLE [ItemsToGroups]  WITH CHECK ADD CONSTRAINT [FK_ItemsToGroups_Groups] FOREIGN KEY([Group])
    REFERENCES [Groups] ([ID])
    
    ALTER TABLE [ItemsToGroups] CHECK CONSTRAINT [FK_ItemsToGroups_Groups]
    
    ALTER TABLE [ItemsToGroups]  WITH CHECK ADD CONSTRAINT [FK_ItemsToGroups_Items] FOREIGN KEY([Item])
    REFERENCES [Items] ([Value])
    
    ALTER TABLE [ItemsToGroups] CHECK CONSTRAINT [FK_ItemsToGroups_Items]
    
    /* Populate your tables. 
       Items should have eight rows: A, B, C,...H
       Groups should have three rows: 1:ABCD, 2:EFG, 3:GHBA
       Items to groups should have eleven rows: A:1, B:1,...A:3 */
    
    /* You will want to pass in a table of values, so set up a table-valued parameter
       First, create a type to support your input list */
    CREATE TYPE ItemList AS TABLE (e char(1) NOT NULL PRIMARY KEY)
    DECLARE @Input ItemList
    GO
    
    /* Create a stored procedure for your query */
    CREATE PROCEDURE SelectOrderedGroup @Input ItemList READONLY AS
        SELECT *
        FROM Groups
        WHERE Groups.ID NOT IN (
            SELECT [Group]
            FROM ItemsToGroups
            WHERE Item NOT IN (SELECT e FROM @Input)
        )
    GO
    
    /* Now when you want to query them: */
    DECLARE @MyList ItemList
    INSERT @MyList(e) VALUES('G'),('H'),('B'),('A')
    EXEC SelectOrderedGroup @MyList
    

    The above will return 3:GHBA, like you want. If you pass in DCBA you'll get back 1:ABCD, again like you're looking for. If you pass in C, you'll get back nothing, as no group consists of just C.

    You will probably want to use a table-valued parameter for your input, as shown above, but you could convert the final SELECT to a simple list and drop the ItemList type.