Search code examples
oracle-databaseplsqltrimnvl

Oracle PL/SQL speed of NVL/LENGTH/TRIM calls versus IS NOT NULL AND != ' '


I try to find the best way to check if a CHAR/VARCHAR2 variable contains characters (NULL or spaces should be considered the same, as "no-value"):

I know there are several solutions, but it appears that (NVL(LENGTH(TRIM(v)),0) > 0) is faster than (v IS NOT NULL AND v != ' ')

Any idea why? Or did I do something wrong in my test code?

Tested with Oracle 18c on Linux, UTF-8 db charset ...

I get the following results:

time:+000000000 00:00:03.582731000

time:+000000000 00:00:02.494980000

set serveroutput on;

create or replace procedure test1
is
    ts timestamp(3);
    x integer;
    y integer;
    v char(500);
    --v varchar2(500);
    begin
    ts := systimestamp;
    --v := null;
    v := 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
    for x in 1..50000000
    loop
        if v is not null and v != ' ' then
           y := x;
        end if;
    end loop;
    dbms_output.put_line('time:' || (systimestamp - ts) ) ;
    end;
/

create or replace procedure test2
is
    ts timestamp(3);
    x integer;
    y integer;
    v char(500);
    --v varchar2(500);
    begin
    ts := systimestamp;
    --v := null;
    v := 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
    for x in 1..50000000
    loop
        if nvl(length(trim(v)),0) > 0 then
           y := x;
        end if;
    end loop;
    dbms_output.put_line('time:' || (systimestamp - ts) ) ;
    end;
/

begin
    test1();
    test2();
end;
/

drop procedure test1;
drop procedure test2;

quit;

Solution

  • The best practice is to ignore the speed difference between small functions and use whatever is easiest.

    In realistic database programming, the time to run functions like NVL or IS NOT NULL is completely irrelevant compared to the time needed to read data from disk or the time needed to join data. If one function saves 1 seconds per 50 million rows, nobody will notice. Whereas if a SQL statement reads 50 million rows with a full table scan instead of using an index, or vice-versa, that could completely break an application.

    It's unusual to care about these kinds of problems in a database. (But not impossible - if you have a specific use case, then please add it to the question.) If you really need optimal procedural code you may want to look into writing an external procedure in Java or C.