Search code examples
sqlsql-serverxmlt-sqlxquery-sql

Using substrings in xml.modify method and possible alternatives


How can I bulk update a column in a SQL Server database?

The situation is the following: I have a column Config that is defined as nvarchar(max), however it contains formatted XML.

Here is an example:

<?xml version="1.0" encoding="utf-16"?>  
<ArrayOfLearningPathItem xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
                         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">    
    <LearningPathItem xsi:type="type">      
        <Id>id</Id>      
        <Title>title</Title>      
        <Content>&lt;iframe allowfullscreen src="https://site.sharepoint.com/sites/site/SitePages/page.aspx" width="100%" height="100%" data-embedSize="3"&gt;&lt;/iframe&gt;</Content>      
        <EmbedType>typeType</EmbedType>    
    </LearningPathItem>  
</ArrayOfLearningPathItem>

I need to bulk update the Content element there so code will be deleted and only URL should stay.

so from this example the result should be

<Content>https://site.sharepoint.com/sites/site/SitePages/page.aspx</Content> 

I have tried to convert the column to XML, put it to the temporary table, search for all elements with the sharepoint link and update them with the new content, and then join the temp table with the old one.

However, I still can't find the way to cut the URL from the code. it's easier for other elements, it works well if I just change the EmbedType parameter:

UPDATE @TempTable
SET ConfigXML.modify('
    replace value of (/ArrayOfLearningPathItem/LearningPathItem[EmbedType="type1" and Content[contains(text()[1], "sharepoint.com")]]/EmbedType/text())[1] 
    with 
    "type2"
    ');

With the URL I would guess something like this could work, if substring-before and substring-after could be used inside XML modify method, but only substring can be used

UPDATE @TempTable
SET ConfigXML.modify('
    replace value of (/ArrayOfLearningPathItem/LearningPathItem[EmbedType="type1" and Content[contains(text()[1], "sharepoint.com")]]/Content/text())[1]
    with 
    substring-before(substring-after(/ArrayOfLearningPathItem/LearningPathItem[EmbedType="type1" and Content[contains(text()[1], "sharepoint.com")]]/Content/text(), "src=&quot;"), "&quot;")
');

Solution

  • Unfortunately, SQL Server does not support most of the new(ish) XQuery functions. You need to extract the data into SQL, use CHARINDEX and SUBSTRING to get the correct data, and push it back into XQuery using sql:column

    UPDATE t
    SET ConfigXML.modify('
        replace value of (
             /ArrayOfLearningPathItem/LearningPathItem
              [EmbedType="type1"]
             /Content/text()[contains(data(.), "sharepoint.com")]
            )[1]
        with sql:column("v3.url")
    ')
    FROM @TempTable t
    CROSS APPLY (
        SELECT ContentValue =
          T.ConfigXML.value('
            (/ArrayOfLearningPathItem/LearningPathItem
              [EmbedType="type1"]
             /Content/text()[contains(data(.), "sharepoint.com")]
            )[1]', 'nvarchar(max)')
    ) x1
    CROSS APPLY (
        SELECT src = NULLIF(CHARINDEX('src="', x1.ContentValue), 0) + 5
    ) v1
    CROSS APPLY (
        SELECT quote = NULLIF(CHARINDEX('"', x1.ContentValue, v1.src), 0)
    ) v2
    CROSS APPLY (
        SELECT url = SUBSTRING(x1.ContentValue, v1.src, v2.quote - v1.src)
    ) v3;
    

    db<>fiddle