Search code examples
excelpowerbipowerquerym

Enumerate Text Values in Power-Query


I have a column in my table that has some text values (input) which I would like to convert to numbers (output) for each unique text value, so that I can do some regression analysis:

Input Output
AOP 1
AOS 2
AOS 2
AOS 2
AOP 1
null 0 or null
AOP 1

I initially tried to do this do this with several Transform: Replace Values steps, however I don't know how to:

  • make this flexible to different numbers of unique values (not hardcode 3 replacements but handle n where n is the number of unique values in input)
  • repeat this for many columns of my table
  • avoid looping as far as possible

What's a better approach?


Solution

  • One way is add custom column with below formula, and do that for each column you care to apply it to, using the value of each text character to generate a unique number

    = try 
     List.Accumulate(Text.ToList([Input]), "", (state, current)=>
     state&Number.ToText(Character.ToNumber(current), "0000")) otherwise null
    

    this would transform all column's text into unique numbers, replacing the original data:

    let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Function = (x) => try  List.Accumulate(Text.ToList(x), "", (state, current)=> state&Number.ToText(Character.ToNumber(current), "0000")) otherwise null,
    TransformList = List.Transform(Table.ColumnNames(Source), each {_ , Function}),
    Output = Table.TransformColumns(Source, TransformList)
    in Output
    

    this would transform all column's text into unique numbers, appending the new columns to existing columns:

    let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Function = (x) => try List.Accumulate(Text.ToList(x), "", (state, current)=> state&Number.ToText(Character.ToNumber(current), "0000")) otherwise null,
    TransformList = List.Transform(Table.ColumnNames(Source), each {_ , Function}),
    Output = Table.TransformColumns(Source, TransformList),
    Numericals=Table.RenameColumns( Output, List.Zip( { Table.ColumnNames( Output), List.Transform(Table.ColumnNames(Output), each _ &"number") } ) ),
    #"Merged Queries" = Table.NestedJoin(Table.AddIndexColumn(Source, "Index", 0, 1),{"Index"},Table.AddIndexColumn(Numericals, "Index2", 0, 1),{"Index2"},"Tabl2",JoinKind.LeftOuter),
    #"Expanded Tabl2" = Table.ExpandTableColumn(#"Merged Queries", "Tabl2", Table.ColumnNames( Numericals),Table.ColumnNames( Numericals)),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Tabl2",{"Index"})
    in #"Removed Columns"