Search code examples
sqloracle-databaseoracle9ianalytic-functions

Oracle analytic function window defined by data


I have a table which represents a line-by-line dump of the data read from a particular text file format. Each line may represent a "master" or a "detail" line, indicated via rec_type code. I'd like to write a query that gets the "master" lines alongside the associated detail lines. I've come up with something that does the job, but it seems a bit hackish and am interested in better ways if any.

CREATE TABLE mdtest
 (rec_seq  NUMBER        PRIMARY KEY
 ,rec_type VARCHAR2(3)   NOT NULL
 ,rec_data VARCHAR2(100) NOT NULL);

INSERT INTO mdtest VALUES (1, '100', 'Bill Jones');
INSERT INTO mdtest VALUES (2, '200', '20080115,100.25');
INSERT INTO mdtest VALUES (3, '100', 'John Smith');
INSERT INTO mdtest VALUES (4, '200', '20090701,80.95');
INSERT INTO mdtest VALUES (5, '200', '20091231,110.35');

Desired Result:

SEQ_EMP  EMP_NAME    SEQ_DATA  EMP_DATA
=======  ==========  ========  ===============
      1  Bill Jones         2  20080115,100.25
      3  John Smith         4  20090701,80.95
      3  John Smith         5  20091231,110.35

Assumptions:

  • records are processed in sequence of rec_seq
  • first record type is a "100"
  • each "100" record has 1 or more "200" records following

Note: this is for Oracle 9i, however we should be upgrading to 11g R1 this year.


Solution

  • Here's what I have so far:

    SELECT seq_emp 
          ,SUBSTR(emp_seq_name,10) emp_name 
          ,seq_data 
          ,emp_data 
    FROM  (SELECT MAX(CASE WHEN rec_type = '100' THEN rec_seq END) 
                  OVER (ORDER BY rec_seq 
                        ROWS BETWEEN UNBOUNDED PRECEDING 
                                 AND CURRENT ROW) seq_emp 
                 ,MAX(CASE 
                      WHEN rec_type = '100' 
                      THEN TO_CHAR(rec_seq,'fm00000000') || '|' || rec_data 
                      END) 
                  OVER (ORDER BY rec_seq 
                        ROWS BETWEEN UNBOUNDED PRECEDING 
                                 AND CURRENT ROW) emp_seq_name 
                 ,rec_seq seq_data 
                 ,rec_type 
                 ,rec_data emp_data 
           FROM   mdtest) 
    WHERE  rec_type = '200' 
    ORDER BY seq_data; 
    

    As you can see, I'm using the MAX reporting analytic function with a window starting from the top of the set down to the current row, to get the relevant "100" record for the current "200" record; then in the outer query I'm discarding the unneeded "100" records.

    To get emp_name, I've had to append the rec_seq with the data so that the MAX function still picks the correct header record; then in the outer query I chop the rec_seq off.

    I've played with other analytic functions and syntax including FIRST_VALUE and the KEEP syntax, but neither of these seem to make this job simpler; the difficulty is that the window is defined by the value of rec_type instead of being some constant offset.