Search code examples
powerquerym

Compare each column's contents with all other columns' contents and present matrix of match counts


Given this table:

enter image description here

I'd like to derive this table:

enter image description here

...sort of like a mileage chart in a map book.

I'm trying to create a cross-table comparison of the words in each of the columns, against all of the other columns' words, to show how many matches there are between them.

For instance, comparing Column 1 against Column2 might yield 4 matches. The yellow, bold outlined cells are the matches.

enter image description here

And here's how I count them:

enter image description here

I'm thinking there might be an 'easy' way to accomplish this using Power Query. Is there?

(Oh...and by the way...the solution I'm looking for should not expect a static number of input columns: i.e., it should accommodate for more columns or less columns to be used in the input comparison set.)

Thanks.


Solution

  • No, there is no easy way, but it can be done. However, I get different results. My interpretation of your logic is: for each column combination, the number of occurrences of each common word in 1 column must be multiplied with the number of occurrences in the other column. These are my results:

    enter image description here

    And this is my query code:

    let
        Source = Table1,
        ColumnNames = Table.ColumnNames(Source),
        Tabled = Table.FromColumns({ColumnNames}, type table[Columns = text]),
        AddedColumns2 = Table.AddColumn(Tabled, "Columns2", each ColumnNames, type {text}),
        ExpandedColumns2 = Table.ExpandListColumn(AddedColumns2, "Columns2"),
        CommonWords = 
            Table.AddColumn(ExpandedColumns2, 
                            "DistinctIntersect", 
                            each if [Columns] = [Columns2]
                               then {} 
                               else List.Distinct(List.Intersect({Table.Column(Source,[Columns]),
                                                                  Table.Column(Source,[Columns2])}))),
        AddedCount = 
            Table.AddColumn(CommonWords,
                            "Count", 
                            (This) => List.Sum({0}&List.Transform(This[DistinctIntersect],
                                                       each List.Count(List.PositionOf(Table.Column(Source,This[Columns]),_,2)) *
                                                            List.Count(List.PositionOf(Table.Column(Source,This[Columns2]),_,2)))),
                           Int64.Type),
        RemovedColumns = Table.RemoveColumns(AddedCount,{"DistinctIntersect"}),
        PivotedColumn = Table.Pivot(RemovedColumns, List.Distinct(RemovedColumns[Columns2]), "Columns2", "Count")
    in
        PivotedColumn