Search code examples
sql-serverxmlt-sqlxquery

Strip value from XML node by locating what the value ends with


I have a column in SQL Server that is stored as XML, here is an example:

<data>
<pages>4935</pages>
<world_id>88BDBD41-736D-4CF1-4BC9-A6BF4935E481</world_id>
<Url>https://example.com/4935</Url>
<Token>https://example.com/80d71735733e91efc74e71a878eeaee5e84935</Token>
<phone>1122334935</phone>
<email>text4935@gmail.com</email>
<alt_number>334554935</alt_number>
</data>


<data>
<book_id>493567</book_id>
<page_id>88BDBD41-736D-4CF1-4BC9-A6BFE4814935</page_id>
<work_number>1114935</work_number>
</data>

The goal is to look through all possible nodes and find the number 4935 but only when it's at the end of its node. If this is located then I want to complete node returned:

The example above should only return the following data:

<pages>4935</pages>
<Url>http://example.org/4935</Url>
<Token>http://example.net/80d71735733e91efc74e71a878eeaee5e84935</Token>
<alt_number>334554935</alt_number>
<page_id>88BDBD41-736D-4CF1-4BC9-A6BFE4814935</page_id>
<work_number>1114935</work_number>

Here is what I have attempted.

I've taken all the data that contains the number 4935 into a #temp table:

SELECT xml_data, xml_id INTO #t
FROM table 
WHERE cast(xml_data as nvarchar(max)) LIKE '%4935%'

I altered XML column to be a string:

ALTER TABLE #t
ALTER COLUMN xml_data varchar(max);

This allows me to manipulate the data as a string. I have used a combination of SUBSTRING, PATINDEX, and LEN to grab what I want but it's not giving me the desired result:

SELECT xml_id, 
       xml_data, 
       Substring (xml_data, PATINDEX( '%4935%',xml_data) - 10,  -- Start Position
LEN(xml_data)           -- End Position.
       ) AS stripped 
FROM   #t

I'm not sure if my attempt in changing the XML to a string is the best approach so I am open to other functions that can accomplish what I am looking to do.

Thank you in advance for taking the time in answering this question.


Solution

  • It is better to stay with the XML data type. XQuery and FLWOR expression make it relatively easy.

    It would be even easier if SQL Server would support the fn:ends-with() function. But MS SQL Server is lagging in its w3c XQuery standards support.

    I recently asked Microsoft to beef up NoSQL functionality in SQL Server.

    Please check it out, add your comments, and vote for it:

    https://feedback.azure.com/d365community/idea/153a9604-7025-ec11-b6e6-000d3a4f0da0

    SQL

    -- DDL and sample data population, start
    DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, xml_data XML);
    INSERT INTO @tbl (xml_data) VALUES
    (N'<root>
        <data>
            <pages>4935</pages>
            <world_id>88BDBD41-736D-4CF1-4BC9-A6BF4935E481</world_id>
            <Url>https://example.com/4935</Url>
            <Token>https://example.com/80d71735733e91efc74e71a878eeaee5e84935</Token>
            <phone>1122334935</phone>
            <email>text4935@gmail.com</email>
            <alt_number>334554935</alt_number>
        </data>
        <data>
            <book_id>493567</book_id>
            <page_id>88BDBD41-736D-4CF1-4BC9-A6BFE4814935</page_id>
            <work_number>1114935</work_number>
        </data>
    </root>');
    -- DDL and sample data population, end
    
    DECLARE @searchFor VARCHAR(20) = '4935';
    DECLARE @searchForLength INT = (LEN(@searchFor) - 1);
    
    SELECT ID, xml_data.query('
            for $x in /root/data/*
            let $len := string-length(string($x/text()[1]))
            return $x[substring($x/text()[1], $len - sql:variable("@searchForLength"), $len) eq sql:variable("@searchFor")]
        ')
    FROM @tbl;
    

    Output

    <pages>4935</pages>
    <Url>https://example.com/4935</Url>
    <Token>https://example.com/80d71735733e91efc74e71a878eeaee5e84935</Token>
    <phone>1122334935</phone>
    <alt_number>334554935</alt_number>
    <page_id>88BDBD41-736D-4CF1-4BC9-A6BFE4814935</page_id>
    <work_number>1114935</work_number>
    

    It is also possible to move XPath predicate to the where clause:

    SELECT ID, xml_data.query('
            for $x in /root/data/*
            let $len := string-length(string($x/text()[1]))
            where substring($x/text()[1], $len - sql:variable("@searchForLength"), $len) eq sql:variable("@searchFor")
            return $x
    ')
    FROM @tbl;
    

    A dream where MS SQL Server fully supports w3c XQuery standards

    SELECT ID, xml_data.query('
        for $x in /root/data/*[ends-with(./text()[1], sql:variable("@searchFor"))]
        return $x
    ')
    FROM @tbl;
    

    To search in any position

    SELECT ID, xml_data.query('
        for $x in /root/data/*[contains(./text()[1], sql:variable("@searchFor"))]
        return $x
    ')
    FROM @tbl;