Search code examples
powerquerym

Power Query Convert Records in List to Columns


I have a list of JSON documents that I'm trying to convert into rows in Power Query. I'm struggling though as the values I need are records, in a list, inside a column the record. Anything that starts getting close to what I need gets horrendously complex 😕

A single record looks like this:

{
    "key_as_string": "2020-02-25T23:00:00.000Z",
    "key": 1582671600000,
    "doc_count": 1086187,
    "attack_types": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
            {
                "key": "attack-sqli",
                "doc_count": 380989
            },
            {
                "key": "attack-protocol",
                "doc_count": 8195
            },
            {
                "key": "attack-xss",
                "doc_count": 1216
            },
            {
                "key": "attack-rce",
                "doc_count": 258
            },
            {
                "key": "attack-disclosure",
                "doc_count": 157
            },
            {
                "key": "attack-lfi",
                "doc_count": 24
            },
            {
                "key": "attack-generic",
                "doc_count": 17
            },
            {
                "key": "attack-rfi",
                "doc_count": 2
            }
        ]
    }
}

And I'm trying to turn it into this: enter image description here The 2nd row shown here is just an example of what a 2nd record would look like, for clarity.

Any help is greatly appreciated!


Solution

  • I started with this JSON, saved in a file called test.json, to ensure I would see how things worked with two records:

        [{
        "key_as_string": "2020-02-25T23:00:00.000Z",
        "key": 1582671600000,
        "doc_count": 1086187,
        "attack_types": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "attack-sqli",
                    "doc_count": 380989
                },
                {
                    "key": "attack-protocol",
                    "doc_count": 8195
                },
                {
                    "key": "attack-xss",
                    "doc_count": 1216
                },
                {
                    "key": "attack-rce",
                    "doc_count": 258
                },
                {
                    "key": "attack-disclosure",
                    "doc_count": 157
                },
                {
                    "key": "attack-lfi",
                    "doc_count": 24
                },
                {
                    "key": "attack-generic",
                    "doc_count": 17
                },
                {
                    "key": "attack-rfi",
                    "doc_count": 2
                }
            ]
        }
    },
    {
        "key_as_string": "2020-02-25T22:00:00.000Z",
        "key": 158267000000,
        "doc_count": 1086186,
        "attack_types": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "attack-sqli",
                    "doc_count": 384419
                },
                {
                    "key": "attack-protocol",
                    "doc_count": 2046
                },
                {
                    "key": "attack-xss",
                    "doc_count": 1504
                },
                {
                    "key": "attack-rce",
                    "doc_count": 198
                },
                {
                    "key": "attack-disclosure",
                    "doc_count": 120
                },
                {
                    "key": "attack-lfi",
                    "doc_count": 16
                },
                {
                    "key": "attack-generic",
                    "doc_count": 200
                },
                {
                    "key": "attack-rfi",
                    "doc_count": 2
                }
            ]
        }
    }]
    

    The I used the GUI to derive this M code, which seems to work:

    let
        Source = Json.Document(File.Contents("MYFILEPATH\test.json")),
        #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"key_as_string", "key", "doc_count", "attack_types"}, {"key_as_string", "key", "doc_count", "attack_types"}),
        #"Expanded attack_types" = Table.ExpandRecordColumn(#"Expanded Column1", "attack_types", {"doc_count_error_upper_bound", "sum_other_doc_count", "buckets"}, {"doc_count_error_upper_bound", "sum_other_doc_count", "buckets"}),
        #"Expanded buckets" = Table.ExpandListColumn(#"Expanded attack_types", "buckets"),
        #"Expanded buckets1" = Table.ExpandRecordColumn(#"Expanded buckets", "buckets", {"key", "doc_count"}, {"key.1", "doc_count.1"}),
        #"Pivoted Column" = Table.Pivot(#"Expanded buckets1", List.Distinct(#"Expanded buckets1"[key.1]), "key.1", "doc_count.1"),
        #"Removed Other Columns" = Table.SelectColumns(#"Pivoted Column",{"key", "attack-sqli", "attack-protocol", "attack-xss", "attack-rce", "attack-disclosure", "attack-lfi", "attack-generic", "attack-rfi"}),
        #"Sorted Rows" = Table.Sort(#"Removed Other Columns",{{"key", Order.Descending}})
    in
        #"Sorted Rows"
    

    Just cut and paste the above M code into your Advanced editor. Replace MYFILEPATH with your file path.

    I got this result:

    enter image description here