I am working on the tree hierarchy of supervisors and their supervised employees. The difficulty is that some supervisors are employees supervised by other supervisors, and there are lots of it.
For SQL queries I acquired from class, only about simple self-joins, which might be only like two levels: A is supervised by B, and that's it.
But the issue from the real world is far more complicated. There are multiple levels, and I am not sure about the exact number. For example, A is supervised by B, and B is supervised by C, and C is supervised by D, etc.
I assume there are only like 4 levels for supervision. The raw data might be like this:
Employee Supervisor
A B
C B
D B
B V
E V
F E
G V
V (Blank which indicates no boss)
And thus I generated the codes like below:
SELECT T1.Supervisor,ifnull(T2.EmployeeName, 'No Boss') as sup_emp1,
ifnull(T3.EmployeeName, 'No Boss') as sup_emp2, ifnull(T4.EmployeeName,
'No boss') as sup_emp3, ifnull(T5.EmployeeName, 'No Boss') as EMPLOYEE
FROM T1
LEFT JOIN T2
ON T1.Supervisor=T2.EmployeeName
LEFT JOIN T3
ON T2.Supervisor = T3.EmployeeName
LEFT JOIN T4
ON T3.Supervisor = T4.EmployeeName
LEFT JOIN T5
ON T4.Supervisor = T5.EmployeeName
WHERE T5. Emp_Status = 'Active'
ORDER BY T1.Supervisor ASC
However, the results turned out to be wrong. I just got so confused about the tables the relationship and failed to think of the recursive joins.
The expected result should be like this:
Supervisor # of Employees
V 7(include indirect supervised employees)
B 3
E 1
And the organization table should be:
Supervisor Employee1 Employee2
V B A
V B C
V B D
V E F
V G Null
But my result turned out to be
Supervisor Employee1 Employee2
V B A
V B C
V B D
V E F
V V G
<- I need the G to be on the second level.
So, can anyone help? Thanks!
... assume there are only like 4 levels for supervision ...
Below is for BigQuery Standard SQL and easily extended to more (reasonably more) levels just by adding more LEFT JOINs and respective fields into COALESCE
#standardSQL
SELECT t.Supervisor,
IF(t.Supervisor = t5.Supervisor,
STRUCT(Employee2 AS Employee1, NULL AS Employee2),
STRUCT(t5.Supervisor AS Employee1, Employee2 AS Employee2)
).*
FROM (
SELECT t1.Employee Supervisor,
COALESCE(t4.Employee, t3.Employee, t2.Employee) Employee2
FROM `project.dataset.table` t1
LEFT JOIN `project.dataset.table` t2 ON t2.Supervisor = t1.Employee
LEFT JOIN `project.dataset.table` t3 ON t3.Supervisor = t2.Employee
LEFT JOIN `project.dataset.table` t4 ON t4.Supervisor = t3.Employee
WHERE t1.Supervisor IS NULL
) t
LEFT JOIN `project.dataset.table` t5 ON t5.Employee = t.Employee2
If to apply to sample data from your question - result is
Row Supervisor Employee1 Employee2
1 V B A
2 V B C
3 V B D
4 V E F
5 V G null