I have a table which represents a line-by-line dump of the data read from a particular text file format. Each line may represent a "master" or a "detail" line, indicated via rec_type
code. I'd like to write a query that gets the "master" lines alongside the associated detail lines. I've come up with something that does the job, but it seems a bit hackish and am interested in better ways if any.
CREATE TABLE mdtest
(rec_seq NUMBER PRIMARY KEY
,rec_type VARCHAR2(3) NOT NULL
,rec_data VARCHAR2(100) NOT NULL);
INSERT INTO mdtest VALUES (1, '100', 'Bill Jones');
INSERT INTO mdtest VALUES (2, '200', '20080115,100.25');
INSERT INTO mdtest VALUES (3, '100', 'John Smith');
INSERT INTO mdtest VALUES (4, '200', '20090701,80.95');
INSERT INTO mdtest VALUES (5, '200', '20091231,110.35');
Desired Result:
SEQ_EMP EMP_NAME SEQ_DATA EMP_DATA
======= ========== ======== ===============
1 Bill Jones 2 20080115,100.25
3 John Smith 4 20090701,80.95
3 John Smith 5 20091231,110.35
Assumptions:
100
"100
" record has 1 or more "200
" records following Note: this is for Oracle 9i, however we should be upgrading to 11g R1 this year.
Here's what I have so far:
SELECT seq_emp
,SUBSTR(emp_seq_name,10) emp_name
,seq_data
,emp_data
FROM (SELECT MAX(CASE WHEN rec_type = '100' THEN rec_seq END)
OVER (ORDER BY rec_seq
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) seq_emp
,MAX(CASE
WHEN rec_type = '100'
THEN TO_CHAR(rec_seq,'fm00000000') || '|' || rec_data
END)
OVER (ORDER BY rec_seq
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) emp_seq_name
,rec_seq seq_data
,rec_type
,rec_data emp_data
FROM mdtest)
WHERE rec_type = '200'
ORDER BY seq_data;
As you can see, I'm using the MAX reporting analytic function with a window starting from the top of the set down to the current row, to get the relevant "100" record for the current "200" record; then in the outer query I'm discarding the unneeded "100" records.
To get emp_name, I've had to append the rec_seq with the data so that the MAX function still picks the correct header record; then in the outer query I chop the rec_seq off.
I've played with other analytic functions and syntax including FIRST_VALUE and the KEEP syntax, but neither of these seem to make this job simpler; the difficulty is that the window is defined by the value of rec_type instead of being some constant offset.