Search code examples
oracle-databaseparsingsubstr

oracle substring returns wrong text


Function below when it is run returns wrong text. Example. Table A contains msg_id and msg_text. Table B contains a string to be parsed.

For some reason I am able to correctly extract the location of all the msg_id from string in Table B. As in start position 2 end position 4. But when it comes to actually extracting from table B string substr (result_val, strt,endstr) returns either too much text or not enough. It might add a space to the substring, or it might add more text at the end. But the actual coordinates correctly match the text I need to extract. Seems like substr takes out a different value..any ideas?

Specifically the variable "extracted" shows the correct ID I'm looking for, but as soon as I try finding that ID anywhere I get wrong results.

Here is what I'm placing in dual

select ('@1 @2') from dual
Output:My comment for number 1 My comment for number 2

select (' @1 @2') from dual
Output: nothing

select ('@1') from dual
output: My comment for number 1

select result_parm(' @1') from dual;
output:  My comment for number 1 --note it did add the space correctly

select result_parm(' @1 @2') from dual;
output: none

A behind the scenes DBMS output shows me the following when parsing '@1 @2 @3'. I had to comment out the replace portion, it seems like something breaks in there and prevents any parse after the second item.

Behind the scenes dbms output: start string 1 end string 2 @1 <-- grabbed the @1 with substr, but it also grabbed an extra space after the 1 start string 4 end string 5 @2 @3 <-- why did it grab incorrectly @2 and @3 when position clearly says 4,5 I have a feeling this is what's also killing the replace but not sure why start string 7 end string 8 @3 <--grabbed correctly

Tables:

create table REC_CANNED_COMM
(
  ENTRY_NUM   INTEGER not null,
  MSG_ID      VARCHAR2(5),
  MSG_TEXT    VARCHAR2(200)
);

create table REC_RESULT
(
ENTRY_NUM  INTEGER not null,
RESULT_TXT VARCHAR2(200)
);


insert into rec_result values ('1','@1');
insert into rec_result values ('2','@1 and @2');
insert into rec_result values ('3','@1 and @2 and @3');
insert into rec_result values ('4','this is entry @4');
insert into rec_result values ('5','This one has no canned message');

insert into rec_canned_comm values ('1','@1','My comment for number 1');
insert into rec_canned_comm values ('2','@2','My comment for number 2');
insert into rec_canned_comm values ('3','@3','My comment for number 3');
insert into rec_canned_comm values ('4','@4','My comment for number 4');
insert into rec_canned_comm values ('5','@5','My comment for number 5');

CREATE OR REPLACE FUNCTION result_parm(result_val VARCHAR2) 
RETURN VARCHAR2   IS      
  num         NUMBER;
  tmp         VARCHAR2(2000);
  strt        NUMBER;
  endstr      NUMBER;
  tmp_msg     VARCHAR2(2000);
  tmp_msg_txt VARCHAR2(2000);
  extracted   VARCHAR2(2000);

  BEGIN

    tmp := result_val;
    num := regexp_count(result_val, '@');
    IF (num = 0)
    THEN RETURN result_val;
    END IF;

    FOR loop_count IN 1..num
    LOOP

      tmp_msg := '';
      tmp_msg_txt := '';

      strt := instr(result_val, '@', 1, loop_count);
      endstr := CASE
                WHEN regexp_instr(result_val,
                                  '[\,\.\;\:\''\     <\>\/\+\-]|\s',
                                  instr(result_val, '@', 1, loop_count)) = 0
                  THEN length(result_val)
                WHEN regexp_instr(result_val,
                                  '^[\,\.\;\:\''\<\>\/\+\-]|\s',
                                  instr(result_val, '@', 1, loop_count)) > 0
                  THEN regexp_instr(result_val,
                                    '^[\,\.\;\:\''\<\>\/\+\-]|\s',
                                    instr(result_val, '@', 1, loop_count)) - 1
                END;

      dbms_output.put_line('start string ' || strt);
      dbms_output.put_line('end string ' || endstr);

      extracted := substr(result_val, strt, endstr);

      dbms_output.put_line(extracted);

      SELECT
        rec_canned_comm.msg_id,
        rec_canned_comm.msg_text
      INTO tmp_msg,
           tmp_msg_txt
      FROM rec_canned_comm
      WHERE rec_canned_comm.msg_id = extracted;

      tmp := replace(tmp, tmp_msg, tmp_msg_txt);
    END LOOP;

    RETURN tmp;

  END;

Solution

  • I was using endstr inappropriately. where ever you see endstr in a instr or substr it should be (endstr - strt +1). I was using a positional location instead of character length. That made all the difference.