Search code examples
sql-servert-sqldata-cleaning

how to remove trailing spaces from the bracket which is a substring of another string


I'm using SQL and some of column values are represented as {3P Test } * 1 + 0.45%.

Here I have trailing spaces after Test and I want to update that to {3P Test} * 1 + 0.45%. How can I update that column so that trailing spaces in brackets are removed?


Solution

  • If this is a one-time update, or if there aren't that many rows, then an iterative approach (as has been suggested in comments on the question) can work, but this is not something you want to do if there is a lot of data as each iteration of UPDATE is a transaction (unless you explicitly execute BEGIN TRANSACTION prior to the WHILE loop, but then if that operation takes a while it will block the table from other operations).

    For example:

    DECLARE @TestTable TABLE (String NVARCHAR(4000));
    INSERT INTO @TestTable (String) VALUES (N'{3P Test   } * 1 + 0.45%');
    
    SELECT * FROM @TestTable;
    
    WHILE (EXISTS(
              SELECT *
              FROM @TestTable tmp
              WHERE tmp.String LIKE N'% }%'
              )
          )
    BEGIN
       UPDATE TOP (100) tmp -- do this in batches to reduce transaction size
       SET tmp.String = REPLACE(tmp.String, N' }', N'}')
       FROM @TestTable tmp
       WHERE tmp.String LIKE N'% }%';
    END;
    
    SELECT * FROM @TestTable;
    

    OR, here is a simplified version of the CHARINDEX idea proposed by @Jodrell that uses a CTE rather than multiple subqueries (does not need to JOIN to the base table) and breaks the string into just 2 instead of 3 pieces:

    DECLARE @TestTable TABLE (String NVARCHAR(4000));
    INSERT INTO @TestTable (String) VALUES (N'{3P Test               } * 1 + 0.45%');
    INSERT INTO @TestTable (String) VALUES (N'{3P Test     4       } * 1 + 0.45%');
    
    SELECT * FROM @TestTable;
    
    WHILE (EXISTS(
              SELECT *
              FROM @TestTable tmp
              WHERE tmp.String LIKE N'% }%'
              )
          )
    BEGIN
    
       ;WITH cte AS
       (
          SELECT TOP (100)
                 tmp.String,
                 CHARINDEX(N'}', tmp.String) AS [RightBracketLocation]
          FROM @TestTable tmp
          WHERE tmp.String LIKE N'% }%'
       )
       UPDATE cte
       SET cte.String =
                 RTRIM(SUBSTRING(cte.String, 1, (cte.[RightBracketLocation] - 1)))
                 + SUBSTRING(cte.String, [RightBracketLocation], 4000)
       FROM cte;
    
    END;
    
    SELECT * FROM @TestTable;
    

    But, if this is to be a common operation, then something more straight forward and direct is better, such as Regular Expressions. There are no built-in Regular Expression functions so you need to use SQLCLR to get that functionality. For example:

    DECLARE @Test NVARCHAR(4000);
    SET @Test = N'{3P Test   } * 1 + 0.45%';
    
    SELECT SQL#.RegEx_Replace(@Test, N'\s+}', N'}', -1, 1, N'IgnoreCase');
    

    In the example I am using the SQL# library (which I am the author of, but the Regular Expression functions are available in the Free version).