Search code examples
db2decodedb2-luw

Most efficient way to DECODE multiple columns -- DB2


I am fairly new to DB2 (and SQL in general) and I am having trouble finding an efficient method to DECODE columns

Currently, the database has a number of tables most of which have a significant number of their columns as numbers, these numbers correspond to a table with the real values. We are talking 9,500 different values (e.g '502=yes' or '1413= Graduate Student')

In any situation, I would just do WHERE clause and show where they are equal, but since there are 20-30 columns that need to be decoded per table, I can't really do this (that I know of).

Is there a way to effectively just display the corresponding value from the other table?

Example:

SELECT TEST_ID, DECODE(TEST_STATUS, 5111, 'Approved, 5112, 'In Progress') TEST_STATUS
FROM TEST_TABLE

The above works fine.......but I manually look up the numbers and review them to build the statements. As I mentioned, some tables have 20-30 columns that would need this AND some need DECODE statements that would be 12-15 conditions.

Is there anything that would allow me to do something simpler like:

SELECT TEST_ID, DECODE(TEST_STATUS = *TableWithCodeValues*) TEST_STATUS
FROM TEST_TABLE

EDIT: Also, to be more clear, I know I can do a ton of INNER JOINS, but I wasn't sure if there was a more efficient way than that.


Solution

  • From a logical point of view, I would consider splitting the lookup table into several domain/dimension tables. Not sure if that is possible to do for you, so I'll leave that part.

    As mentioned in my comment I would stay away from using DECODE as described in your post. I would start by doing it as usual joins:

    SELECT a.TEST_STATUS
         , b.TEST_STATUS_DESCRIPTION
         , a.ANOTHER_STATUS
         , c.ANOTHER_STATUS_DESCRIPTION
         , ...
    FROM TEST_TABLE as a
    JOIN TEST_STATUS_TABLE as b
        ON a.TEST_STATUS = b.TEST_STATUS
    JOIN ANOTHER_STATUS_TABLE as c
        ON a.ANOTHER_STATUS = c.ANOTHER_STATUS
    JOIN ...
    

    If things are too slow there are a couple of things you can try:

    • Create a statistical view that can help determine cardinalities from the joins (may help the optimizer creating a better plan):

    https://www.ibm.com/support/knowledgecenter/sl/SSEPGG_9.7.0/com.ibm.db2.luw.admin.perf.doc/doc/c0021713.html

    • If your license admits you can experiment with Materialized Query Tables (MQT). Note that there is a penalty for modifications of the base tables, so if you have more of a OLTP workload, this is probably not a good idea:

    https://www.ibm.com/developerworks/data/library/techarticle/dm-0509melnyk/index.html

    A third option if your lookup table is fairly static is to cache the lookup table in the application. Read the TEST_TABLE from the database, and lookup descriptions in the application. Further improvements may be to add triggers that invalidate the cache when lookup table is modified.