Search code examples
informix

How can I filter character values in an Informix database that contain a timestamp by year and month?


I use an Informix database with the version "IBM Informix Dynamic Server Version 14.10".

I have a column with char(20). We can call the column "my_example_column". All values of this column contain a timestamp.

This looks like this:

ABC_202001171317325

The first 4 characters "ABC_" are not relevant for me.

However, the timestamp is relevant for me. The first 4 numbers "2020" represent the year. The following 2 numbers "01" represent the month. The following 2 numbers "17" represent the day.

So the date would be January 17, 2020.

I want to filter the values in this named column "my_example_column" by year and month.

How can I do this?


Solution

  • Using IBM Informix Dynamic Server Version 14.10.FC10 we can try the following:

    -- Create the table
    CREATE TABLE my_example_table
    (
        my_example_column CHAR(20)
    );
    
    -- Populate the table with 9999 rows, DBDATE env variable is set to Y4MD-
    INSERT INTO my_example_table( my_example_column )
    SELECT 'ABC_' || TO_CHAR( DATE( '2000-01-01' ) + ( LEVEL::CHAR(20)::INTERVAL DAY(6) TO DAY ), '%Y%m%d' ) || '0101001' AS datestring
    FROM sysmaster:sysdual
    CONNECT BY LEVEL < 10000;
    
    -- Create an index on the column
    CREATE INDEX my_example_table_idx1 ON my_example_table( my_example_column );
    
    -- Update row distribution for the table ( create index already does this for the column )
    UPDATE STATISTICS HIGH FOR TABLE my_example_table;
    

    Using the substring operator to filter the results:

    SELECT *
    FROM my_example_table
    WHERE
        my_example_column[5,8] = '2001'
    AND my_example_column[9,10] = '02';
    

    This query will return what we want, but it will do a sequencial scan of the table ( full table scan ), ignoring the index.

    If we change the query to use the LIKE operator, we have:

    SELECT *
    FROM my_example_table
    WHERE
        my_example_column LIKE 'ABC\_2001%'
    AND my_example_column LIKE 'ABC\_____02%'
    ;
    

    This query will make use of the index, but only because the 1st 4 characters are a constant ( that is the reason I am escaping the "_" on the filters ).

    We can also make use of a functional index, but have not tested if it is more performant. To do that we need to define a function and use it to build the index.

    CREATE FUNCTION get_my_year( in_my_example_column CHAR( 20 ) )
    RETURNING CHAR( 4 ) AS year
    WITH( NOT VARIANT );
    
     RETURN in_my_example_column[5,8];
     
    END FUNCTION;
    
    
    CREATE FUNCTION get_my_month( in_my_example_column CHAR( 20 ) )
    RETURNING CHAR( 2 ) AS month
    WITH( NOT VARIANT );
    
     RETURN in_my_example_column[9,10];
     
    END FUNCTION;
    
    CREATE INDEX my_example_table_idx2 ON my_example_table( get_my_year( my_example_column ) );
    CREATE INDEX my_example_table_idx3 ON my_example_table( get_my_month( my_example_column ) );
    

    I used 2 functions to create 2 functional indexes ( which is probably overkill, 1 index should be enough ). Or we can create a single composite index:

    CREATE INDEX my_example_table_idx4 ON my_example_table( get_my_year( my_example_column ), get_my_month( my_example_column ) );
    

    Now we need a query that uses the functions utilized to create the indexes:

    SELECT *
    FROM my_example_table
    WHERE
        get_my_year( my_example_column ) = '2001'
    AND get_my_month( my_example_column ) = '02'
    ;
     
    

    Using the functional indexes we can ignore the 1st 4 characters of the column ( remember, they need to be constant for the LIKE query to use the index ).