Search code examples
sqlapache-spark-sqlsubqueryunionohdsi-omop

How can I pass the results of a query/subquery to another query, and UNION the results together?


ANSWERED! CTE is the best method, using descriptive dummyvariable names

I have two relevant tables concept which describes all of the database wide concepts (concept_code) from different vocabularies (vocabulary_id), and relates them to a unique, standardized ID concept_id, and concept_relationship which only has three columns, concept_id_1, concept_id_2, and relationship_id. Relationship ID describes how ID one relates to ID 2. The database team attempts to standardize concepts into one ID as much as possible using the OMOP standards, so many IDs that are at source from one vocabulary get remapped to their corresponding parent standardized ID using the "Maps to" relationship. I have sets of codes which correspond to IDs that were defined by subject matter experts in specific vocabularies that don't necessarily include what the database team chose as their standardized concept. The end product I want is a table representing both the codes/IDs defined by the SMEs, and the corresponding IDs that they "map to".

########################################################################

The new code that seems to be behaving as intended uses a WITH statement:


WITH query1 AS (
  SELECT 
    * 
  FROM 
    concept 
  WHERE 
    concept_code IN ("398.4", "I45")
    AND (
      vocabulary_id IN ("ICD9","ICD10")
    )
)
SELECT *
FROM concept
WHERE concept_id IN (
    SELECT concept_id_2
    FROM concept_relationship cr
    INNER JOIN query1 q1 ON cr.concept_id_1 = q1.concept_id
    WHERE relationship_id = "Maps to"
)
UNION 
SELECT * FROM query1

########################################################################

I have tried the query below:

SELECT 
  foo.* 
FROM 
  (
    SELECT 
      * 
    FROM 
      concept 
    WHERE 
      concept_code IN ("398.4", "I45") 
      AND (
        vocabulary_id IN ("ICD9", "ICD10")
      )
  ) AS foo 
UNION 
SELECT 
  * 
FROM 
  concept 
WHERE 
  concept_id IN (
    SELECT 
      concept_id_2 
    FROM 
      concept_relationship cr 
      INNER JOIN foo ON cr.concept_id_1 = foo.concept_id 
    WHERE 
      relationship_id = "Maps to"
  )


But it returns an "INVALID ARGUMENT" error, I'm guessing because I can't pass the subquery foo above below the UNION to the other query. Is there a smoother way to do this? I've included some dummy tables below to attempt reproducibility, which seems to work database-side, but all columns are not included for conciseness.

concept :

| concept_id  | concept_code | vocabulary_id |
| ----------- | ------------ |---------------|
| 123         | 398.4        | ICD9          |
| 111         | I45          | ICD10         |
| 145         | 45155841     | SNOMED        |

concept_relationship:

| concept_id_1 | concept_id_2 | relationship_id|
| -----------  | ------------ |--------------- |
| 123          | 145          | Maps to        |
| 111          | 145          | Maps to        |
| 145          | 145          | Maps to        |
| 145          | 111          | Maps from      |
| 145          | 123          | Maps from      |
| 145          | 145          | Maps from      |
CREATE TABLE `concept` (
  `concept_id` VARCHAR NOT NULL, 
  `concept_code` VARCHAR NOT NULL, 
  `vocabulary_id` VARCHAR NOT NULL, 
  PRIMARY KEY (`concept_id`)
);

INSERT INTO concept (
  concept_id, concept_code, vocabulary_id
) 
VALUES 
  ("123", "388.4", "ICD9"), 
  ("111", "I45", "ICD10"), 
  ("145", "45155841 ", "SNOMED");


CREATE TABLE `concept` (
  `concept_id_1` VARCHAR NOT NULL, 
  `concept_id_2` VARCHAR NOT NULL, 
  `relationship_id` VARCHAR NOT NULL, 
  PRIMARY KEY (`concept_id_1`)
);

INSERT INTO concept_relationship (
  concept_id_1, concept_id_2, relationship_id
) 
VALUES 
  ("123", "145", "Maps to"), 
  ("111", "145", "Maps to"), 
  ("145", "145", "Maps to"), 
  ("145", "111", "Maps from"), 
  ("145", "123", "Maps from"), 
  ("145", "145", "Maps from");


Solution

  • The problem is you use foo on the other side of the union and it can't see it. but a CTE can help like this

    WITH foo AS (
      SELECT * 
      FROM concept 
      WHERE concept_code IN ("398.4", "I45") AND
            vocabulary_id IN ("ICD9", "ICD10")
    )
    
    SELECT *
    FROM foo
    
    UNION 
    
    SELECT * 
    FROM concept 
    WHERE concept_id IN (
        SELECT concept_id_2 
        FROM concept_relationship cr 
        JOIN foo ON cr.concept_id_1 = foo.concept_id 
        WHERE relationship_id = "Maps to"
    )
    

    note the 2nd query after the UNION can be re-written as

    SELECT * 
    FROM concept 
    JOIN concept_relationship cr ON cr.concept_id_2 = concept.conecpt_id
                                AND cr.relationship_id = "Maps to"
    JOIN foo on cr.concept_id_1 = foo.concept_id
    

    using joins.