Search code examples
powerbipowerquery

How to generate data dictionary from Power BI report/dataset?


Assuming Power BI dataset has columns with description added in the modelling view, then is there a way to generate data dictionary from Power BI report/dataset?


Solution

  • You can document using Power Query and Excel. Change your .pbix to a .pbit file and then run the following query to extract a full data dictionary. This is not my code and comes courtesy of RacketLuncher on Reddit.

    let
        Source = fUnzip(File.Contents("C:\Users\Your file.pbit")),
        Filter_DataModelSchema = Table.SelectRows(Source, each ([FileName] = "DataModelSchema" and [Attributes]?[Hidden]? <> true)),
        JSONFile = Json.Document(Filter_DataModelSchema{0}[Content],1200),
        model = JSONFile[model], // Start from here to change whether you want to pick Tables, or Relationships, or other parts of the model.    
        tables = model[tables], // Here we are picking the tables metadata
        #"Converted to Table" = Table.FromList(tables, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"name", "lineageTag", "modifiedTime", "structureModifiedTime", "columns", "partitions", "measures", "annotations"}, {"name", "lineageTag", "modifiedTime", "structureModifiedTime", "columns", "partitions", "measures", "annotations"}),
        #"Expanded columns" = Table.ExpandListColumn(#"Expanded Column1", "columns"),
        #"Expanded columns1" = Table.ExpandRecordColumn(#"Expanded columns", "columns", {"name", "dataType", "isKey", "sourceColumn", "formatString", "lineageTag", "summarizeBy", "annotations", "sortByColumn"}, {"columns.name", "columns.dataType", "columns.isKey", "columns.sourceColumn", "columns.formatString", "columns.lineageTag", "columns.summarizeBy", "columns.annotations", "columns.sortByColumn"}),
        #"Expanded measures" = Table.ExpandListColumn(#"Expanded columns1", "measures"),
        #"Expanded measures1" = Table.ExpandRecordColumn(#"Expanded measures", "measures", {"name", "expression", "formatString", "displayFolder", "lineageTag", "annotations"}, {"measures.name", "measures.expression", "measures.formatString", "measures.displayFolder", "measures.lineageTag", "measures.annotations"})
    in
        #"Expanded measures1"