Search code examples
sqlsql-serversql-server-2014

SQL SELECT rows in a table if the given condition is a substring of the column data with a separator ';' between the substrings


I am trying to get rows from the table Mails if the column [To] has the mail abc@mail.

The simple solution would be Select * from Mails where [To] = 'abc@mail'

But the thing is [To] column has data like 123@mail;abc@mail;aabc@mail etc separated by semicolons where To is multiple emails sent

I know I could do something like Select * from Mails where [To] like '%abc@mail%' but that won't solve the problem if the given mail is a substring of another mail. I thought of a split string solution

I have a split_string function like this,

CREATE FUNCTION [dbo].[split_string]
(
@string_value NVARCHAR(MAX),
@delimiter_character CHAR(1)
)
RETURNS @result_set TABLE(splited_data NVARCHAR(MAX)
)
BEGIN
DECLARE @start_position INT,
        @ending_position INT
SELECT @start_position = 1,
        @ending_position = CHARINDEX(@delimiter_character, @string_value)
WHILE @start_position < LEN(@string_value) + 1
        BEGIN
    IF @ending_position = 0 
       SET @ending_position = LEN(@string_value) + 1
    INSERT INTO @result_set (splited_data) 
    VALUES(SUBSTRING(@string_value, @start_position, @ending_position - @start_position))
    SET @start_position = @ending_position + 1
    SET @ending_position = CHARINDEX(@delimiter_character, @string_value, @start_position)
END
RETURN
END

which would return splitted string of a single data in a column and the function is working fine.

I tried executing the query

Select * 
from Mails 
where 'abc@mail' in (
   Select * 
   from dbo.split_string((SELECT [To] FROM Mails) , ';')
)

which is throwing the error:

Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

I need help proceeding from here. I am using Microsoft SQL Server 2014.


Solution

  • TL;DR; Here is the query that you want

    SELECT *
    FROM dbo.Mails AS m
    WHERE EXISTS (
        SELECT *
        FROM dbo.split_string(m.[To], ';') s
        WHERE s.splited_data = 'abc@mail'
    )
    

    I recommend the splitting approach. Any character lookup will have to account the variability of the semi-colons, whereas splitting it out will handle the ambiguity of where the semi-colons are, and then you can do a direct equality check. If you wanted to take it a step further and look for additional [To] addresses you can just add an IN clause like this and SQL Server doesn't have to do much more work and you get the same results.

    SELECT *
    FROM dbo.Mails AS m
    WHERE EXISTS (
        SELECT *
        FROM dbo.split_string(m.[To], ';') s
        WHERE s.splited_data IN ('abc@mail', 'def@mail')
    )
    

    My answer is fairly similar to @Kitta answer in that we split the data out, and @Kitta is correct about the IN clause, but while their answer will work it will require you grouping your data back together to get a singular answer. Using the EXISTS clause will bypass all of that for you and only give you the data from the original table. That being said, please mark @Kitta as the answer if their answer works just as well for you.

    Here is the test setup that I used

    DROP TABLE Mails
    GO
    CREATE TABLE Mails
    ([To] VARCHAR(3000))
    
    INSERT INTO dbo.Mails
    (
        [To]
    )
    VALUES
    ('123@mail;abc@mail;aabc@mail')
    ,('[email protected]')
    ,('[email protected];[email protected]')
    ,('aabc@mail;ewrkljwe@mail')
    ,('ewrkljwe@mail')
    
    GO
    DROP FUNCTION [split_string]
    GO
    CREATE FUNCTION [dbo].[split_string]
    (
    @string_value NVARCHAR(MAX),
    @delimiter_character CHAR(1)
    )
    RETURNS @result_set TABLE(splited_data NVARCHAR(MAX)
    )
    BEGIN
    DECLARE @start_position INT,
            @ending_position INT
    SELECT @start_position = 1,
            @ending_position = CHARINDEX(@delimiter_character, @string_value)
    WHILE @start_position < LEN(@string_value) + 1
            BEGIN
        IF @ending_position = 0 
           SET @ending_position = LEN(@string_value) + 1
        INSERT INTO @result_set (splited_data) 
        VALUES(SUBSTRING(@string_value, @start_position, @ending_position - @start_position))
        SET @start_position = @ending_position + 1
        SET @ending_position = CHARINDEX(@delimiter_character, @string_value, @start_position)
    END
    RETURN
    END
    GO
    
    
    SELECT *
    FROM dbo.Mails AS m
    WHERE EXISTS (
        SELECT *
        FROM dbo.split_string(m.[To], ';') s
        WHERE s.splited_data = 'abc@mail'
    )
        
    

    and it returns the correct row of '123@mail;abc@mail;aabc@mail'