Search code examples
sqlssms

Why do I have multiple entries per entity in the query output?


Would like to know why my query displays multiple entries per entity in the output. From what I understand there is only one active policy per entity.

Created query with SQL Server Management Studio, my output to display correctly has parameters, and I have tried the following with my query.

Currently my SQL SSMS query output displays the following:

Entity_Number  Building_Name PolicyID  Description         Start_Date   End_Date
 400           Xpress         4         5 Day Grace        7/1/2019     9/27/2019
 400           Xpress         18        2 Day Grace        7/3/2018     7/13/2018
 400           Xpress         19        4 Day Grace        2/27/2019    2/27/2019

What I really would like to know is how do I drill down and find out why my query returns multiples?

[Query]

SELECT
e.Entity_Number,
bld.Building_Name,
cbp.PolicyId, 
cbp.Description,  
cbp.StartDate, 
cbp.EndDate

FROM            
dbo.buildings AS bld
INNER JOIN dbo.entities AS e
ON bld.Entity_ID = e.Entity_ID 
INNER JOIN Collections.Building AS cbp
ON bld.Building_ID = cb.BuildingId
INNER JOIN Collections.BuildingProfile AS cbpro
ON cbp.BuildingPolicyId = cbpro.BuildingPolicyId

WHERE
bld.Building_Active = 1
AND e.Active = 1

Solution

  • Use the "salami technique" to isolate where the unexpected rows come from. What I mean by this is that you cut down the query like a salami by omitting each join (and any column references related to that join) one by one.

    e.g. start with masking the join to Collections.BuildingProfile:

    SELECT
          e.Entity_Number
        , bld.Building_Name
        , cbp.PolicyId
        , cbp.Description
        , cbp.StartDate
        , cbp.EndDate
    FROM dbo.buildings AS bld
    INNER JOIN dbo.entities AS e ON bld.Entity_ID = e.Entity_ID
    INNER JOIN Collections.Building AS cbp ON bld.Building_ID = cbp.BuildingId
    -- INNER JOIN Collections.BuildingProfile AS cbpro ON cbp.BuildingPolicyId = cbpro.BuildingPolicyId
    WHERE bld.Building_Active = 1
        AND e.Active = 1
    

    Does this remove the unexpected columns? If not then try:

    SELECT
          e.Entity_Number
        , bld.Building_Name
        --, cbp.PolicyId
        --, cbp.Description
        --, cbp.StartDate
        --, cbp.EndDate
    FROM dbo.buildings AS bld
    INNER JOIN dbo.entities AS e ON bld.Entity_ID = e.Entity_ID
    --INNER JOIN Collections.Building AS cbp ON bld.Building_ID = cbp.BuildingId
    --INNER JOIN Collections.BuildingProfile AS cbpro ON cbp.BuildingPolicyId = cbpro.BuildingPolicyId
    WHERE bld.Building_Active = 1
        AND e.Active = 1
    

    Eventually by masking out each join (and any related column references to that table) you will discover which table is producing the unexpected multiplication of rows.

    Once that table is identified I suggest you reconsider all assumptions you have made about how that table had been joined. For example, you state that " From what I understand there is only one active policy per entity." Is that really true?

    Once you know where the problem starts, and you reconsider how that data should actually be used within the query, you should be closer to a solution. e.g. perhaps you need more conditions in the join, or you need to join a subquery instead of directly to the table.

    Note:

    • Collections.BuildingProfile does not seem needed by the query, why not omit it anyway?
    • reformatting for "comma first" in the select clause helps simplify use of the "salami technique"