I have a denormalized table that I'm attempting to evaluate for hierarchical issues. The correct format is a top down hierarchy structure for a large organization. Conceptually, everyone should have just one reporting manager, but since it's denormalized, this isn't always the case. I want to find all of the records where a particular employee has more than one reporting manager. Here's some example data:
LVL1_First_Name|LVL1_Last_Name|LVL1_Emp_ID|LVL2_First_Name|LVL2_Last_Name|LVL2_Emp_ID|LVL3_First_Name|LVL3_Last_Name|LVL3_Emp_ID
---------------+--------------+-----------+---------------+--------------+-----------+---------------+--------------+-----------
Jack |Gorshon |0001 |Henry |Johnson |0003 |Amy |Taylor |0005
Robert |Harris |0002 |Henry |Johnson |0003 |Nick |Greer |0006
Robert |Harris |0002 |Jim |Balial |0004 |Josh |Wolfe |0007
As you can see, there is an issue with the hierarchy as Henry Johnson reports to both Robert and Jack. I'm looking to create a SQL statement that would pull out the first two records of this table because they don't follow a true top down hierarchy.
I've attempted to solve this with a group by ... having statement but it isn't getting me the results I need.
Note: I'm not in charge of this data format. My goal is to sanitize it and clean it up before putting it into a proper, normalized data structure.
Hmmm . . . you can unpivot the data to normalize it and then aggregate:
select v.emp_id, min(lev), max(lev)
from t cross apply
(values (LVL1_Emp_ID, LVL2_Emp_ID, 2), (LVL1_Emp_ID, LVL3_Emp_ID, 3)
) v(parent_emp_id, emp_id, lev)
from v.emp_id
group by v.emp_id
having min(parent_emp_id) <> max(parent_emp_id);