Someone migrated a DB2 database to SQL Server and during the migration all PK and FK were lost. there is no way to have them back.
But thanks to this query I'm now able to Reverse Engineer the database diagram based on the datatype
and column name
. (I know, is a Reverse Engineer based on assumptions).
I now would like to create a Power BI dashboard with the Chord visual and create a link between tables that have the same column name. (the image is just an example to give you an idea)
But I cannot use From
and To
because I'm not tracking a change.
Maybe I'm using the wrong visual?
How to track things that are in common?
EDIT FOR BOUNTY:
If I run the query against Adventure Works:
WITH ColumnCount AS(
SELECT s.name AS SchemaName,
t.name AS TableName,
c.name AS ColumnName,
ct.[name] AS DataType,
c.max_length,
c.precision,
c.scale,
COUNT(c.column_id) OVER (PARTITION BY c.[name], ct.[name], c.max_length, c.precision, c.scale) AS Duplicates
FROM sys.schemas s
JOIN sys.tables t ON s.schema_id = t.schema_id
JOIN sys.columns c ON t.object_id = c.object_id
JOIN sys.types ct ON c.user_type_id = ct.user_type_id)
SELECT *
FROM ColumnCount CC
WHERE CC.Duplicates > 1
ORDER BY CC.ColumnName,
CC.SchemaName,
CC.TableName;
The query can group columns that have the same: ColumnName
, DataType
, max_length
, precision
, scale
.
But how to render this in a Power BI Chord visual?
The goal is ti find link between tables.
Chord seems the best visual to archive it but if you have better visual to suggest I'm open to your tips.
To relate the Power BI Chord visual, you need a minimum of two data points and a measure;
Depending on the number of tables, you may find the Chord chart a little overwhelmed by data, however if you bring those three values through form Adventure Works, you can produce
The dataset you are starting with has a lot more info, and rows than you need, but doesn't naturally contain the relationship. By simplifying the dataset with the following, you can create your Chord data point
WITH ColumnCount AS(
SELECT t.name AS TableName,
c.name AS ColumnName,
COUNT(c.column_id) OVER (PARTITION BY c.[name], c.max_length, c.precision, c.scale) AS Duplicates
FROM sys.tables t
JOIN sys.columns c ON t.object_id = c.object_id
WHERE c.name like '%Id'
AND c.name != 'rowguid'
)
SELECT TableName ForeignTableName,
LEFT(ColumnName, LEN(ColumnName)-2) PrimaryTableName,
1 Relationship
FROM ColumnCount CC
WHERE CC.Duplicates > 1
AND LEFT(ColumnName, LEN(ColumnName)-2) != TableName
ORDER BY PrimaryTableName,
CC.ColumnName,
CC.TableName
I have made a few assumptions in the above for simplicity.
All foreign keys end in ID (to stop incorrect key matches like ActualCost) with
WHERE c.name like '%Id'
Rowguid is not a relationship column and is therefore excluded
AND c.name != 'rowguid'
We don't want a primary key table relationship to itself
LEFT(ColumnName, LEN(ColumnName)-2) != TableName
Scheme is not important (and therefore removed)
The query will give you a result set like follows;
Then to create the chart, simply add the From, To and Values count with your PrimaryTableName, ForeignTableName and Relationship (count) values as follows
Giving you the required Power BI Chord relationship chart
As per your comments, if you want to do the same, but use the field names to match, (not assuming name[id]) to denote your fk / pk, then the following would work. I have included a section to force the primary table, based on the Primary Key Check Constraint, that you can comment out if you want to show all relationships, but be warned you will have thousands and they will be bi directional if you don't have a way to identify a primary key/table!
WITH ColumnCount AS(
SELECT s.name AS SchemaName,
t.name AS TableName,
c.name AS ColumnName,
ct.[name] AS DataType,
c.max_length,
c.precision,
c.scale,
COUNT(c.column_id) OVER (PARTITION BY c.[name], ct.[name], c.max_length, c.precision, c.scale) AS Duplicates
FROM sys.schemas s
JOIN sys.tables t ON s.schema_id = t.schema_id
JOIN sys.columns c ON t.object_id = c.object_id
JOIN sys.types ct ON c.user_type_id = ct.user_type_id
)
SELECT cc.TableName + '.' + cc.ColumnName ForeignTableName, cd.TableName + '.' + cd.ColumnName PrimaryTableName, 1 Relationship --,cc.ColumnName, cc.DataType, cc.Duplicates
FROM ColumnCount CC
/*pk only joins - take ths out if you want all joins between fields, not just pk to fk */
INNER JOIN INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE Col on col.COLUMN_NAME = cc.ColumnName
AND col.TABLE_NAME = cc.TableName
INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS Tab on Col.Constraint_Name = Tab.Constraint_Name
AND Col.Table_Name = Tab.Table_Name
AND Constraint_Type = 'PRIMARY KEY'
/*end of pk only joins */
LEFT JOIN ColumnCount Cd on cc.ColumnName = cd.ColumnName
and cc.DataType= cd.DataType
and cc.TableName != cd.TableName
WHERE CC.Duplicates > 1
ORDER BY CC.ColumnName,
CC.SchemaName,
CC.TableName;
Plugged into the same structure, this provides a Chord diagram of