Search code examples
pythonsqlexcelreport

2 SQL queries interlaced on ID number


I have a 2 queries that will be run repetitively to feed a report and some charts so need to make sure it is tight. First query has 25 columns and will yield out 25-50 rows from a massive table. My second query will result in another 25 columns (a couple matching columns) of 25 to 50 rows from another massive table.

Desired end result is a single document in that Query 1 (Problem) and Query 2 (Problem tasks) could match on a common column (Problem ID) so that row 1 is the problem, row 2-4 is the tasks, row 5 is the next problem and 6-9 are the tasks....ect. Now I realize I could do this manually by running the 2 queries and them just combining them in excel by hand, but looking for a eloquent process that could be reusable in my absence without too much overhead.

I was exploring inserts, union all, and cross join but the 2 queries have different columns that contain different critical data elements to be returned. Also, exploring setting up a Python job to do this by importing the CSVs and interlacing results but I am a early data science student and not yet much past creating charts from imported CSVs.

Any suggestions on how I might attack this challenge? Thanks for the help. Picture of desired end result. enter image description here


Solution

  • You can do it with something like

    INSERT INTO target_table (<columns...>)
      SELECT <your first query> 
      UNION
      SELECT <your second query>
    

    And then to retrieve data

    SELECT * from target_table 
      WHERE <...>
      ORDER BY problem_id, task_id
    

    Just ensure both queries return the same columns, i.e. the columns you want to populate in target_table, probably using fixed default values (e.g. the first query may return a default task_id by including NULL as task_id in the column list)