Search code examples
mysql

SQL - Remove certain text within cell while keeping the other text


I am trying to remove the TMID from the cell as it takes up space and creates noise from the description.

I am looking to go from this (each row here is a single cell)

Transfer CH x9944 to CH x6664 TMID:T123456789123 111-DEP-0513-Check (2,

Transfer CH x8881 to CH x5554 TMID:1a2234e56-6dd4-2425-REDEPOSIT4851975

Transfer CH x1111 to CH x2225 TMID:d558w44h-64tt-6658-DF-0514-Check

To this, each row here still remains a single cell

Transfer CH x9944 to CH x6664 111-DEP-0513-Check (2,

Transfer CH x8881 to CH x5554 REDEPOSIT4851975

Transfer CH x1111 to CH x2225 DF-0514-Check

So in essence removing that TMID, thank you!

I'm not quite sure if a substring_index would work? Or maybe a REPLACE function but my knowledge isn't broad enough on this subject to really know so would appreciate the help!

I've looked at some other posts and I've noticed they have specific text to remove or even exact character count. With this ID of course they are different but they also differ in character length as well so I'm a bit stuck here.


Solution

  • Normally a simple replace is pretty easy, just replace this thing with that thing. However, as in this case, sometimes the issue is that there is not a specific standardization of ID values. There are letters, there are numbers, and the length varies. How can we find any sort of pattern with this type of situation?

    We must consider the things that we do know, such as what characters are we using? We can see they all start with the same prefix, so we know that is the same, then what is the min and max character count. But basically after we can work out the perfect regular expression to search for we can combine that with an UPDATE statement to make the changes.

    Before attempting to do any mass update like this, you should ALWAYS back your data up because it can be a one way process and too vast to manually repair by hand.

     UPDATE 
        YOURTABLENAMEHERE
     SET
       description = REGEXP_REPLACE(description, 'TMID:\\S{18,21}', '');
    

    The UPDATE command makes the change, replace YOURTABLENAMEHERE with the database.table location of the table in question. And we are wanting to set a new value for the description column. Using that Regex designation it will find all the values that start with TMID: and search for the criteria that encompasses all the types that match this pattern. In this case the length of 18 and 24 characters is used. And the last part of the replace with the '' means we want to replace that with an Empty value and thus delete it out.