Search code examples
sql-serverselectsubstringasciicharindex

SQL Server - CHARINDEX on ASCII values


I have a column on a SQL Server table that has a lengthy value with line breaks. I am trying to find the values before each line break.

EXAMPLE:

--Column name: ItemDescription

Case Qty: 12
Weight: 8 oz.
Flavor code: PB
Size: STOCK
Cut: 1/8" x 1/8" x 3/16"
Additions: Bells
Cover Brine #: P1
Kosher Cert: OU
Organic Cert: 

This is EXACTLY what the I get when I copy the cell on my results and paste it. So I converted this field to VARBINARY and saw what ASCII codes are in here. Here is a part of ASCII interpretation of a value:

43 61 73 65 20 51 74 79 3A 20 31 32 0D0A 57 65 69 67 68 74 3A 20 38 20 6F 7A 2E 0D0A 46

0D0A meaning Carriage return and Line feed.

PREFERRED OUTCOME:

Now that the data is clear, I am trying to find the value after the colon and before the line break and put it in a new column.

So here is what my preferred outcome should look like:

enter image description here

WHAT I'VE TRIED:

Here is my current SQL Query to do this:

DECLARE @firstLine int
DECLARE @secondLine int
DECLARE @thirdLine int
DECLARE @fourthLine int
DECLARE @fifthLine int
DECLARE @sixthLine int
DECLARE @seventhLine int
DECLARE @eighthLine int


DECLARE @firstColon int
DECLARE @secondColon int
DECLARE @thirdColon int
DECLARE @fourthColon int
DECLARE @fifthColon int
DECLARE @sixthColon int
DECLARE @seventhColon int
DECLARE @eighthColon int
DECLARE @ninethColon int
DECLARE @itemDesc varchar(MAX)

SELECT 
    @itemDesc = ItemDescription
    ,@firstLine = CHARINDEX(CHAR(13), ItemDescription, 1)
    ,@secondLine = CHARINDEX(CHAR(13), ItemDescription, @firstLine + 1)
    ,@thirdLine =  CHARINDEX(CHAR(13), ItemDescription, @secondLine + 1)
    ,@fourthLine =  CHARINDEX(CHAR(13), ItemDescription, @thirdLine + 1)
    ,@fifthLine =  CHARINDEX(CHAR(13), ItemDescription, @fourthLine+ 1)
    ,@sixthLine =  CHARINDEX(CHAR(13), ItemDescription, @fifthLine + 1)
    ,@seventhLine =  CHARINDEX(CHAR(13), ItemDescription, @sixthLine + 1)
    ,@eighthLine = CHARINDEX(CHAR(13), ItemDescription, @seventhLine + 1)

    ,@firstColon = CHARINDEX(CHAR(58), ItemDescription, 1)--aaa
    ,@secondColon = CHARINDEX(CHAR(58), ItemDescription, @firstLine + 1)
    ,@thirdColon = CHARINDEX(CHAR(58), ItemDescription, @secondLine + 1)
    ,@fourthColon = CHARINDEX(CHAR(58), ItemDescription, @thirdLine + 1)
    ,@fifthColon = CHARINDEX(CHAR(58), ItemDescription, @fourthLine + 1)
    ,@sixthColon = CHARINDEX(CHAR(58), ItemDescription, @fifthLine + 1)
    ,@seventhColon = CHARINDEX(CHAR(58), ItemDescription, @sixthLine + 1)
    ,@eighthColon = CHARINDEX(CHAR(58), ItemDescription, @seventhLine + 1)
    ,@ninethColon = CHARINDEX(CHAR(58), ItemDescription, @eighthLine + 1)
FROM TableName

SELECT 
    ItemDescription
    ,CONVERT(VarBInary, itemDescription)
    ,LTRIM(SUBSTRING(ItemDescription, @firstColon + 2, @firstLine - (@firstColon - 1))) as caseQty          --1
    ,LTRIM(SUBSTRING(ItemDescription, @secondColon + 2, @secondLine - (@secondColon - 1))) as caseQty           --2
    ,LTRIM(SUBSTRING(ItemDescription, @thirdColon + 2, @thirdLine - (@thirdColon - 1))) as FlavorCode
    ,LTRIM(SUBSTRING(ItemDescription, @fourthColon + 2, @fourthLine - (@fourthColon - 1))) as Size
    ,LTRIM(SUBSTRING(ItemDescription, @fifthColon + 2, @fifthLine - (@fifthColon - 1))) as Cut
    ,LTRIM(SUBSTRING(ItemDescription, @sixthColon + 2, @sixthLine - (@sixthColon - 1))) as Additions
    ,LTRIM(SUBSTRING(ItemDescription, @eighthColon + 2, @eighthLine- (@eighthColon - 1))) as Brine
FROM
    TableName

THE ISSUE:

For some reason, the SUBSTRING isn't getting the right substring! I get the Qty and Weight correctly. But if size is RELISHSTOCK, I get RELISHSTOC. For FlavorCode, I get ut: (substring of "Cut:"). For Cut, I get 8" x 3/. For Additions, I get Brin (Substring of "Cover Brine").

WHAT am I doing wrong!! I've tried multiple different combinations of the substring length. I know for a fact it is the length. But now that I realize all the starting positions after flavor code is screwed up, I am not too sure where the mistake is.

Any help would be greatly appreciated.

Thank you.


Solution

  • After about a week of struggling with this problem, and a week of trying to move my execution idea to the suggestions mentioned in the comments, I ended up finding a solution to this problem.

    Step 1:
    I found out that it was easier to manipulate a special character like ! than it was for line break and line feed (Char(10) and Char(13)).
    So the first thing I did was to replace all the line break characters to !. Like so:

    REPLACE(REPLACE(columnName, char(10), ''), char(13), '!')
    

    This makes sure that the ! always has a space before it, making it easier to deal with.

    Step 2:
    I found an answer on SO that had a table-value function to split a string. The function name is fnSplitString and it takes two parameters: @string nvarchar(MAX) and @delimited char(1).

    ALTER FUNCTION [dbo].[fnSplitString] 
    ( 
        @string NVARCHAR(MAX), 
        @delimiter CHAR(1) 
    ) 
    RETURNS @output TABLE(columnName NVARCHAR(MAX), outValue NVARCHAR(MAX)) 
    BEGIN 
        DECLARE 
            @start INT
            ,@end INT
            ,@colon INT
    
        SELECT 
            @start = 1
            ,@colon = CHARINDEX(@delimiter, @string)
            ,@end = CHARINDEX(CHAR(33), @string)
        WHILE @start < LEN(@string) + 1 BEGIN 
            IF(SUBSTRING(@string, @colon + 1, 1) = ' ')
            BEGIN
                INSERT INTO 
                    @output (columnName, outValue)  
                VALUES
                    ((SUBSTRING(@string, @start, @colon - @start)) ,(SUBSTRING(@string, @colon + 2,  @end - @colon - 2)))
                SET @start = @end + 1 
                SET @colon = CHARINDEX(@delimiter, @string, @start)
                SET @end = CHARINDEX(CHAR(33), @string, @start)
            END
            ELSE
            BEGIN
                INSERT INTO 
                    @output (columnName, outValue)  
                VALUES
                    ((SUBSTRING(@string, @start, @colon - @start)) ,(SUBSTRING(@string, @colon + 1,  @end - @colon - 1)))
                SET @start = @end + 1 
                SET @colon = CHARINDEX(@delimiter, @string, @start)
                SET @end = CHARINDEX(CHAR(33), @string, @start)
            END
        END 
        RETURN 
    END
    

    I had to change a lot to the function because this function was originally meant for getting values AFTER a delimiter up to the end of the string, which is pretty straightforward. But mine had to go from the delimiter to the !, which is char(33).

    Step 3:
    I wrote a stored procedure that handles all these operations. I already had a pretty lengthy stored procedure in place for my requirement and I had to add the results of this function joined with the results of a select statement to a new CTE and use that in the final result set.

    For anyone that is curious, found in another SO question that the join of a table and a function can be done like so:

    SELECT
        table1.columnName1
        ,function1.columnName2
        ,function1.columnName3
    FROM
        tableName1 table1
    CROSS APPLY
        dbo.functionName1(table1.columnName2, ':') function1
    

    That did it.

    I hope this helps someone in the future.