We have multiple tables with similar structures, and we consolidate them into covering views using UNION ALL statements. The source tables do not have exact column names, or data types. The tables/views are wide enough and with enough UNIONs such that it makes updating and maintaining the views a nightmare. To taht end I have built a meta-data driven dynamic SQL process to ease this maintenance going forward.
Having built the meta-data tables and procedure, I do not wish to unleash it upon my colleagues without adding the meta-data, nor do I want to populate it all myself by hand. Therefore I was hoping to extract the column data types and ordinality from system views/procs/funcs via a one-time script.
Is it at all possible to retrieve this information programatically?
Example view:
CREATE VIEW info.vw_ActivitySummary AS
SELECT
Financial_Year AS FINANCIAL_YEAR
, CAST(NULL AS INT) AS IMPORT_ID
, Monthly_Activity AS ACTIVITY_VALUE
FROM info.Activity_2324
UNION ALL SELECT
FINANCIAL_YEAR AS FINANCIAL_YEAR
, ImportFlowNumber AS IMPORT_ID
, CAST(NULL AS DECIMAL(38,19)) AS ACTIVITY_VALUE
FROM info.Activity_2223
Research led me to try this DMF:
SELECT * FROM sys.dm_sql_referenced_entities('info.vw_ActivitySummary', 'object')
At first I was hopeful, but this led me to a dead-end simply because there is no link between the ordinality of the output columns and the referenced columns that feed them. True, there is a link in the column order from the [referenced_minor_id] column, but other than their order, the numbers are contiguous and do not break for self-contained calculated columns (such as my CAST(NULL...)s as shown in the example).
Edit
Since there's already an answer that indicates question needs clarity, let me add:
Imagine a set of tables:
info.CoveringView
info.CoveringViewColumns
info.CoveringViewColumnsTableColumns
with cascading one-to-many relationships such that the following view would be possible:
CREATE VIEW info.CoveringViewSpecifications AS
SELECT v.VIEW_SCHEMA,
v.VIEW_NAME,
c.VIEW_COLUMN_NAME,
c.ORDINAL_POSITION,
c.DATA_TYPE_AND_SIZE,
s.TABLE_SCHEMA,
s.TABLE_NAME,
s.TABLE_COLUMN_NAME
FROM info.CoveringView v
INNER JOIN info.CoveringViewColumns c
ON c.CoveringViewId = v.Id
INNER JOIN info.CoveringViewColumnsTableColumns s
ON s.CoveringViewColumnsId = c.Id;
And from this one can write a function that creates individual select statements that have the columns ordered correctly, with appropriate CAST(NULL...).. statements such that when UNION (ALL)'d together they form a usable covering view, all data types cast explicitly, etc.
Now imagine there's already a number of views we want to find meta-data for that have around 30-40 UNIONs in, so, can I generate that meta-data from the existing views programatically?
I still hold out hope there is a more elegant way to handle this, however I settled on parsing the full view definition and using string manipulation:
SET NOCOUNT ON;
DECLARE @UnionQuery NVARCHAR(MAX) = OBJECT_DEFINITION(OBJECT_ID(N'dbo.acm'));
-- This should be 'UNION' or 'UNION ALL' as required. Include it exactly as
-- it would appear in your query, or add extra string manipulation before
-- the replace
DECLARE @UnionText NVARCHAR(20) = N'UNION ALL';
-- Choose some single character that is guaranteed not to be in your object
-- definition. This shotgun character is usually a safe bet.
-- (STRING_SPLIT only accepts single-character delimiters)
DECLARE @Delimiter NCHAR(1) = N'¬';
-- Line endings are easier to read as a variable than two concatenated CHAR
-- functions
DECLARE @CrLf NCHAR(2) = CHAR(13) + CHAR(10);
-- Tab characters, for readability, same as above
DECLARE @Tab NCHAR(1) = CHAR(9);
-- Remove CREATE ... AS parts
SET @UnionQuery =
REPLACE(
@UnionQuery
, SUBSTRING(
@UnionQuery
, CHARINDEX('CREATE', @UnionQuery)
, CHARINDEX('AS', @UnionQuery, CHARINDEX('CREATE', @UnionQuery))
- CHARINDEX('CREATE', @UnionQuery) + 2
)
, ''
);
-- Remove single-line comments
WHILE CHARINDEX('--', @UnionQuery) > 0
BEGIN
SET @UnionQuery =
REPLACE(
@UnionQuery
, SUBSTRING(
@UnionQuery
, CHARINDEX('--', @UnionQuery)
, CHARINDEX(@CrLf, @UnionQuery, CHARINDEX('--', @UnionQuery))
- CHARINDEX('--', @UnionQuery) + 2
)
, ''
);
END;
-- Remove multi-line comments
WHILE CHARINDEX('/*', @UnionQuery) > 0
BEGIN
SET @UnionQuery =
REPLACE(
@UnionQuery
, SUBSTRING(
@UnionQuery
, CHARINDEX('/*', @UnionQuery)
, CHARINDEX('*/', @UnionQuery, CHARINDEX('/*', @UnionQuery))
- CHARINDEX('/*', @UnionQuery) + 2
)
, ''
);
END;
-- Replace UNION part with single-character delimiter for STRING_SPLIT
SET @UnionQuery = REPLACE(@UnionQuery, @UnionText, @Delimiter);
-- We can store our results here
DROP TABLE IF EXISTS #ParsedColumns;
CREATE TABLE #ParsedColumns (
SourceTable NVARCHAR(128) NOT NULL
, ColumnName NVARCHAR(128) NOT NULL
, OrdinalPosition INT NOT NULL
);
-- Split the query into individual SELECT statements
DECLARE @Queries AS TABLE (
ID INT IDENTITY(1, 1)
, Query NVARCHAR(MAX) NULL
);
INSERT INTO @Queries (
Query
)
SELECT value
FROM STRING_SPLIT(@UnionQuery, @Delimiter);
-- Parse each SELECT statement
DECLARE @Index INT = 1;
DECLARE @CurrentQuery NVARCHAR(MAX);
WHILE EXISTS (SELECT 1 FROM @Queries WHERE ID = @Index)
BEGIN
SELECT @CurrentQuery = Query
FROM @Queries
WHERE ID = @Index;
-- Extract table name from the FROM clause
-- (simple version, assumes single table in each SELECT)
DECLARE @TableName NVARCHAR(128) =
(LTRIM(
RTRIM(
SUBSTRING(
@CurrentQuery
, CHARINDEX('FROM', @CurrentQuery) + 5
, CHARINDEX(
' '
, @CurrentQuery + ' '
, CHARINDEX('FROM', @CurrentQuery) + 5
) - CHARINDEX('FROM', @CurrentQuery) - 5
)
)
)
);
-- Remove SELECT and FROM clause for easier column extraction
DECLARE @ColumnPart NVARCHAR(MAX) =
LTRIM(
RTRIM(
REPLACE(
SUBSTRING(
@CurrentQuery
, CHARINDEX('SELECT', @CurrentQuery) + 6
, CHARINDEX('FROM', @CurrentQuery)
- CHARINDEX('SELECT', @CurrentQuery) - 6
)
, 'SELECT'
, ''
)
)
);
-- Extract column names and insert into temporary table
DECLARE @OrdinalPosition INT = 1;
WHILE CHARINDEX(',', @ColumnPart) > 0
BEGIN
DECLARE @ColumnName NVARCHAR(128) =
LTRIM(
RTRIM(
SUBSTRING(@ColumnPart, 1, CHARINDEX(',', @ColumnPart) - 1)
)
);
SET @ColumnPart =
SUBSTRING(
@ColumnPart
, CHARINDEX(',', @ColumnPart) + 1
, LEN(@ColumnPart)
);
INSERT INTO #ParsedColumns (
SourceTable
, ColumnName
, OrdinalPosition
)
VALUES
(REPLACE(REPLACE(@TableName, @CrLf, ''), @Tab, '')
, REPLACE(REPLACE(@ColumnName, @CrLf, ''), @Tab, '')
, @OrdinalPosition);
SET @OrdinalPosition = @OrdinalPosition + 1;
END;
-- Insert the last column
INSERT INTO #ParsedColumns (
SourceTable
, ColumnName
, OrdinalPosition
)
VALUES
(REPLACE(REPLACE(@TableName, @CrLf, ''), @Tab, '')
, REPLACE(REPLACE(LTRIM(RTRIM(@ColumnPart)), @CrLf, ''), @Tab, '')
, @OrdinalPosition);
-- Move to the next SELECT statement
SET @Index = @Index + 1;
END;
The resulting temp table can now be used to feed my meta-data tables:
-- Retrieve and display the parsed columns
SELECT PARSENAME(pc.SourceTable, 2) AS SOURCE_SCHEMA
, PARSENAME(pc.SourceTable, 1) AS SOURCE_NAME
, pc.OrdinalPosition
, c.COLUMN_NAME
, PARSENAME(pc.ColumnName, 1) AS COLUMN_NAME
, pc.ColumnName AS RAW_COLUMN
FROM
#ParsedColumns pc
LEFT JOIN INFORMATION_SCHEMA.COLUMNS c
ON c.TABLE_SCHEMA = PARSENAME(pc.SourceTable, 2)
AND c.TABLE_NAME = PARSENAME(pc.SourceTable, 1)
AND c.COLUMN_NAME = PARSENAME(pc.ColumnName, 1)
ORDER BY
pc.SourceTable
, pc.OrdinalPosition;
In my example above I have used INFORMATION_SCHEMA to quality-check the results.
There's some assumptions based on my exact use-case, so please check carefully what your requirements are, such as each sub-query being a simple SELECT..FROM statement without joins.