Search code examples
sqlsql-serversql-server-2016

How to list all columns (including schema and table) used in a SQL query


I need to list all of the columns (including schema and table) used in a SQL query.

The reason I am doing this is because I need to document all of the required data that is being used by a shortlist of existing Power BI reports, stored procedures and views. I am then doing a gap analysis to see what data is already available in a new environment and documenting what data needs to be migrated in order to be able to reproduce the reports, stored procedures and views that are in the shortlist.

Input

SELECT FullName, EmailAddress
FROM dbo.Customers t1
LEFT JOIN dbo.Customer_EmailAddress t2 ON t1.CustomerID = t2.CustomerID

Output should be:

dbo.Customers.CustomerID
dbo.Customers.FirstName
dbo.Customer_EmailAddress.CustomerID
dbo.Customer_EmailAddress.EmailAddress

Does anyone know of a way to do this? I need to do this for a large number of queries and I am currently doing it manually which is very time-consuming!


Solution

  • Once you have a query, you can get an execution plan in XML format, E.g. by using the following code.

    SET SHOWPLAN_XML ON;  
    GO
    
    -- your query here
    
    select ...
    
    GO
    
    SET SHOWPLAN_XML OFF;  
    GO
    

    The execution plan contains ColumnReference elements which hold the data that you need (database, schema, table & column), E.g.:

    <ColumnReference Database="[mydb]" Schema="[sys]" Table="[sysschobjs]" Alias="[o]" Column="id">
    

    Insert the XML that you got into a table, E.g.

    create table ExecutionPlan(id int primary key,execution_plan xml)
    
    -- cannot insert the real XML here due to SO post length limitations
    insert into ExecutionPlan values (1, '<ShowPlanXML ...');
    insert into ExecutionPlan values (2, '<ShowPlanXML ...');
    insert into ExecutionPlan values (3, '<ShowPlanXML ...');
    

    When done with the inserts, use the following query to extract the columns


    with xmlnamespaces('http://schemas.microsoft.com/sqlserver/2004/07/showplan' as ns)
    
    select      distinct
    
                ep.id
               ,e.val.value ('@Database' ,'sysname')    as db 
               ,e.val.value ('@Schema'   ,'sysname')    as scm 
               ,e.val.value ('@Table'    ,'sysname')    as tab 
               ,e.val.value ('@Column'   ,'sysname')    as col 
                             
    from        ExecutionPlan as ep
                cross apply execution_plan.nodes ('//ns:ColumnReference') as e (val)     
    
    where       e.val.value ('@Database' ,'sysname') is not null
    
    order by    id, db, scm, tab, col
    

    +----+-----------------------+-------+--------------------+----------+
    | id |          db           |  scm  |        tab         |   col    |
    +----+-----------------------+-------+--------------------+----------+
    |  1 | [mydb]                | [dbo] | [mytable]          | i        |
    |  1 | [mydb]                | [dbo] | [mytable]          | v        |
    |  2 | [mydb]                | [dbo] | [mytable]          | i        |
    |  2 | [mydb]                | [dbo] | [mytable]          | v        |
    |  3 | [mssqlsystemresource] | [sys] | [syspalnames]      | class    |
    |  3 | [mssqlsystemresource] | [sys] | [syspalnames]      | name     |
    |  3 | [mssqlsystemresource] | [sys] | [syspalnames]      | value    |
    |  3 | [mssqlsystemresource] | [sys] | [syspalvalues]     | class    |
    |  3 | [mssqlsystemresource] | [sys] | [syspalvalues]     | name     |
    |  3 | [mssqlsystemresource] | [sys] | [syspalvalues]     | value    |
    |  3 | [mydb]                | [sys] | [sysidxstats]      | id       |
    |  3 | [mydb]                | [sys] | [sysidxstats]      | indid    |
    |  3 | [mydb]                | [sys] | [sysidxstats]      | intprop  |
    |  3 | [mydb]                | [sys] | [sysmultiobjrefs]  | class    |
    |  3 | [mydb]                | [sys] | [sysmultiobjrefs]  | depid    |
    |  3 | [mydb]                | [sys] | [sysobjvalues]     | objid    |
    |  3 | [mydb]                | [sys] | [sysobjvalues]     | valclass |
    |  3 | [mydb]                | [sys] | [sysobjvalues]     | valnum   |
    |  3 | [mydb]                | [sys] | [sysobjvalues]     | value    |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | created  |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | id       |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | intprop  |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | modified |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | name     |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | nsclass  |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | nsid     |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | pclass   |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | pid      |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | status   |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | status2  |
    |  3 | [mydb]                | [sys] | [sysschobjs]       | type     |
    |  3 | [mydb]                | [sys] | [syssingleobjrefs] | class    |
    |  3 | [mydb]                | [sys] | [syssingleobjrefs] | depid    |
    |  3 | [mydb]                | [sys] | [syssingleobjrefs] | depsubid |
    |  3 | [mydb]                | [sys] | [syssingleobjrefs] | indepid  |
    +----+-----------------------+-------+--------------------+----------+
    

    Fiddle

    P.S.
    You would probably want to automate the process by writing a code (e.g. in Python) that connect to your database, pick a query, get its execution plan and save it into a table.