Search code examples
excelpowerquerym

Splitting a Column into Rows in Power Query using Line Feed followed by Anything but a Space as a Delimiter


In Excel's Power Query tool, I need to do the following:

Input Data

Data
foo1.SUB1
foo1.SUB1.1
foo1.SUB1.2
foo1.SUB2

Output Data

Data
foo1.SUB1
foo1.SUB1.1
foo1.SUB1.2
foo1.SUB2

In other words, I need to split a column into rows where the delimiter is a line feed (represented by #(lf) in Power Query) followed by any character other than a space.

If Power Query accepted regular expressions in a delimiter, I would use #(lf)[^ ].

In other words, I'd like to create this applied step:

Split Column into Rows

However, Power Query does not understand regular expressions.

Is there a way I can accomplish this?


Solution

  • One way in powerquery to cheat it is to split on line break, check first character of each row for a space, then recombine

    let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
    #"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 0, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each if Text.Start([Column1],1)<>" " then [Index] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
    #"Grouped Rows" = Table.Group(#"Filled Down", {"Custom"}, {{"New", each Text.Combine(List.Transform([Column1], Text.From), "#(lf)"), type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Custom"})
    in #"Removed Columns"
    

    when pasting back into excel remember to enable Wrap Text

    enter image description here

    You can actually use regular expressions in powerquery but they are a pain, slow, and dont seem to be needed here