Search code examples
mysqlregexregexp-replace

MySQL: Convert Field with StackOverflow-style Coded URLs to HTML


We have a legacy database table full of StackOverflow-like coded URLs.

=====================================================================================
| TABLE_COMMENTS                                                                    |
=====================================================================================
| f_ID | f_comment                                                                 |
-------------------------------------------------------------------------------------
|   1  | To buy coffee [click here](https://google.com) or [here](https://bing.com) |
|   2  | Check [this out](https://stackoverflow.com)                                |
|   3  | [Cat Photos](https://google.com/images/?cats)                              |
=====================================================================================

I need to SELECT these as standard HTML so they look like the below table.

| f_comment                                                                                        |
----------------------------------------------------------------------------------------------------
| To buy coffee <a href="https://google.com">click here</a> or <a href="https://bing.com">here</a> |
| Check <a href="https://stackoverflow.com">this out</a>                                           |
| <a href="https://google.com/images/?cats">Cat Photos</a>                                         |
====================================================================================================

I have a method that will work if there is just one URL in the f_comment field, but I can't work out how to make it work for, like row 1 where there are two URLs.

SELECT 
    CONCAT(
        substring_index(substring_index(f_comment, '[', 1),')', 1),
        "<a href=", 
        substring_index(substring_index(f_comment, '(', -1),')', 1),
        ">",
        substring_index(substring_index(f_comment, '[', -1),']', 1),
        "</a>"
    )
    AS f_replacementtext
FROM TABLE_COMMENTS;

SQL Fiddle

Results of my incomplete query, where it misses the first URL of f_ID 1

 ==========================================================
| f_comment                                                |
 ----------------------------------------------------------
| To buy coffee <a href="https://bing.com">here</a>        |
| Check <a href="https://stackoverflow.com">this out</a>   |
| <a href="https://google.com/images/?cats">Cat Photos</a> |
 ==========================================================

Solution

  • set @col =
    'To buy coffee <a src=https://google.com>click here</a> or <a src=https://bing.com>here</a>';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> SELECT REGEXP_REPLACE( @col, '<a src=(http.*?)>(.*?)</a>', '[$2]($1)');
    +----------------------------------------------------------------------------+
    | REGEXP_REPLACE( @col, '<a src=(http.*?)>(.*?)</a>', '[$2]($1)')            |
    +----------------------------------------------------------------------------+
    | To buy coffee [click here](https://google.com) or [here](https://bing.com) |
    +----------------------------------------------------------------------------+
    

    Going the other way...

    SET @col =
    'To buy coffee [click here](https://google.com) or [here](https://bing.com)';
    SELECT
    REGEXP_REPLACE( @col, '\\[(.*?)\\][(](http.*?)[)]',
                          '<a src=\$2>\$1</a>' ); 
    
    +------------------------------------------------------------------------------------------------------+
    | REGEXP_REPLACE( @col, '\\[(.*?)\\][(](http.*?)[)]',
                              '<a src=\$2>\$1</a>' ) |
    +------------------------------------------------------------------------------------------------------+
    | To buy coffee <a src=https://google.com>click here</a> or <a src=https://bing.com>here</a>           |
    +------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)