Search code examples
mariadb

mariadb how to split string by delimiter into result set in a stored function


I'm trying to create a split function that I can call in my stored procedures, I know there is a function SUBSTRING_INDEX, I'm trying to return either a temporary table or a JSON array and failing. I don't have much to show:

BEGIN
#Splits a subject text "strSubject" using the supplied "strPattern"
#The results are returned in the temporary table tbl_split_results
 RETURN SUBSTRING_INDEX(strSubject, strPattern, 9999);
END

Example of call:

SELECT split("Simon\r\nSusan\r\nLewis\r\nJordan\r\nOliver\r\n", "\r\n");

This just returns a single string with the delimiter "\r\n" removed, what I want is a row for each item. The version of MariaDB is 11.5 (x64) running on Windows 10.0.19045.4780.

[Edit] so far, but still errors, not very helpful errors either, code:

CREATE DEFINER=`root`@`localhost` FUNCTION `split`(
    `ttxtSubject` TINYTEXT,
    `ttxtPattern` TINYTEXT
)
RETURNS JSON
LANGUAGE SQL
NOT DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
COMMENT 'splits string by pattern'
BEGIN
#Splits a subject text "ttxtSubject" using the supplied "ttxtPattern"
#The results are returned in the temporary table tbl_split_results
    DECLARE intItr INT;
    #DECLARE jsonArray JSON;
    DECLARE ttxtResult TINYTEXT;
    SET intItr = 1;
    #SET jsonArray = Json_Make_Array();
    REPEAT
        SET ttxtResult = SUBSTRING_INDEX(ttxtSubject, ttxtPattern, intItr);
        IF ttxtResult THEN
            #Json_Array_Add(jsonArray, ttxtResult);
            SET intItr = intItr + 1;
        END IF;
    UNTIL NOT ttxtResult; 
    #RETURN jsonArray;
    RETURN NULL;
END

I've commented out several lines to try and debug, hasn't helped, the error shown when I try to save the function:

SQL Error (1064): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'LOOP`forLoop`;
RETURN jsonArray;
END' at line 23

I've edited the post several times in an effort to fix, with various bits commented out, it still gives the same error.

Another edit have progressed a little:

CREATE DEFINER=`root`@`localhost` FUNCTION `split`(
    `ttxtSubject` TINYTEXT,
    `ttxtPattern` TINYTEXT
)
RETURNS longtext CHARSET utf8mb4 COLLATE utf8mb4_bin
LANGUAGE SQL
NOT DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
COMMENT 'splits string by pattern'
BEGIN
#Splits a subject text "ttxtSubject" using the supplied "ttxtPattern"
#The results are returned in the temporary table tbl_split_results
    DECLARE jsonArray JSON;
    DECLARE ttxtResult TINYTEXT;
    SET jsonArray = Json_Make_Array();
    forLoop:FOR intItr IN 1..9999
    DO
        SET ttxtResult = SUBSTRING_INDEX(ttxtSubject, ttxtPattern, intItr);
        IF NOT ttxtResult THEN
            LEAVE forLoop;
        END IF;
        SET jsonArray = JSON_ARRAY(jsonArray, ttxtResult);
        SET intItr = intItr + 1;
    END FOR forLoop;
    RETURN jsonArray;
END

But event though I'm using MariaDB 11.8, I get:

SQL Error (1305): FUNCTION dbname.Json_Make_Array does not exist

So I changed code back to:

CREATE DEFINER=`root`@`localhost` FUNCTION `split`(
    `ttxtSubject` TINYTEXT,
    `ttxtPattern` TINYTEXT
)
RETURNS longtext CHARSET utf8mb4 COLLATE utf8mb4_bin
LANGUAGE SQL
NOT DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
COMMENT 'splits string by pattern'
BEGIN
#Splits a subject text "ttxtSubject" using the supplied "ttxtPattern"
#The results are returned in the temporary table tbl_split_results
    DECLARE jsonArray JSON;
    DECLARE ttxtResult TINYTEXT;
    SET jsonArray = JSON_ARRAY();
    forLoop:FOR intItr IN 1..9999
    DO
        SET ttxtResult = SUBSTRING_INDEX(ttxtSubject, ttxtPattern, intItr);
        IF NOT ttxtResult THEN
            LEAVE forLoop;
        END IF;
        SET jsonArray = JSON_ARRAY(jsonArray, ttxtResult);
        SET intItr = intItr + 1;
    END FOR forLoop;
    RETURN jsonArray;
END

However when I test with:

SELECT split("Simon\r\nSusan\r\nLewis\r\nJordan\r\nOliver\r\n", "\r\n");

The result shown is:

[]

Solution

  • In the end I made constructed the JSON array myself:

    CREATE DEFINER=`root`@`localhost` FUNCTION `split`(
        `mtxtSubject` MEDIUMTEXT,
        `mtxtPattern` MEDIUMTEXT
    )
    RETURNS mediumtext CHARSET latin1 COLLATE latin1_swedish_ci
    LANGUAGE SQL
    DETERMINISTIC
    CONTAINS SQL
    SQL SECURITY DEFINER
    COMMENT 'splits string by pattern'
    BEGIN
    #Splits a subject text "mtxtSubject" using the supplied "mtxtPattern"
    #The results are returned in the temporary table tbl_split_results
        DECLARE intItem, intItemLen, intPatternLen, intSkipLen INT; 
        DECLARE mtxtResult MEDIUMTEXT DEFAULT '[';
        DECLARE mtxtNew MEDIUMTEXT;
        SET intItem = 0;
        Set intPatternLen = LENGTH(mtxtPattern);
        txtLoop:LOOP
            IF LENGTH(mtxtSubject) = 0 THEN
                LEAVE txtLoop;
            END IF;
        #Search for the first match using the passed delimiter
            SET mtxtNew = SUBSTRING_INDEX(mtxtSubject, mtxtPattern, 1);
            SET intItemLen = LENGTH(mtxtNew);
            IF intItemLen = 0 THEN
        #Nothing taken, abort search        
                LEAVE txtLoop;
            END IF;
        #Work out total length to skip in subject mater 
            SET intSkipLen = intItemLen + intPatternLen;
        #Append the new item to the results
            SET mtxtResult = CONCAT(mtxtResult
                            ,IF(LENGTH(mtxtResult) > 1, ',', ''), "'", mtxtNew, "'");       
        #Skip over the content and delimiter just extracted 
            SET mtxtSubject = SUBSTRING(mtxtSubject, intSkipLen + 1);
        #Increment item number
            SET intItem = intItem + 1;
        END LOOP txtLoop;
        SET mtxtResult = CONCAT(mtxtResult, ']');
        RETURN mtxtResult;
    END
    

    Example:

    SELECT split("Simon\r\nPlatten", "\r\n");
    

    Result:

    ['Simon','Platten']