Search code examples
sqlpostgresqlrelational-division

Nested queries on a single table


I'm having some trouble with a query where I have a table where I want to pick out items that have been marked with different fields.

Specifically, boxes on a conveyor that have moved around and been marked at different locations.

Below is what I have (it doesnt work - returns multiple records in inner select clause causing an error).

The field barcode is not unique. The way it is supposed to work is it will go around and for each time it passes certain areas a new record will be added for that barcode. I want to find the barcodes that have been to 3 different areas with one of them being lane_id LIKE(CRANE%_IN)

I know my error is a logical one but I'm having trouble getting my head around queries on the same table with sub queries.

As easy as the concept is (and I could do it in any language in 2 minutes) it needs to be done in SQL (postgres).

SELECT barcode, lane_id
FROM acts_divert_log
WHERE lane_id IN ('GTP01', 'GTP02', 'GTP03', 'GTP04')
AND divstat IN (0, 1)
AND barcode =
(
    SELECT barcode FROM acts_divert_log WHERE
    lane_id LIKE ('CRANE%_IN') 
    AND divstat IN (0, 1) 
    AND barcode =
    (
        SELECT barcode FROM acts_divert_log WHERE
        lane_id IN ('GTP01', 'GTP02', 'GTP03', 'GTP04')
        AND divstat IN (0, 1)
    ) 
);

The expected results should be unique barcodes which have been seen in 2 of the GTP0% areas and at least one CRANE%_IN area.

Any help or hints will be much appreciated. SQL really isn't my forte.


Solution

  • It's a case of relational division. There are many possible ways to solve this. One simple and fast approach:

    SELECT barcode
         , a1.lane_id AS lane_id1
         , a2.lane_id AS lane_id2
         , a3.lane_id AS lane_id3
    FROM   acts_divert_log a1
    JOIN   acts_divert_log a2 USING (barcode)
    JOIN   acts_divert_log a3 USING (barcode)
    WHERE  a1.lane_id IN ('GTP01', 'GTP02', 'GTP03', 'GTP04')
    AND    a2.lane_id IN ('GTP01', 'GTP02', 'GTP03', 'GTP04')
    AND    a3.lane_id LIKE 'CRANE%_IN'
    AND    a1.divstat IN (0, 1)
    AND    a2.divstat IN (0, 1)
    AND    a3.divstat IN (0, 1)
    AND    a1.lane_id <> a2.lane_id;  -- assuming you want distinct lane_ids
    

    We assembled and arsenal of techniques under this related question:

    You need matching indexes to make this fast. Details depend on undisclosed information about table definition, data distribution, cardinalities etc.