I am importing a lot of tables from pdf-files into Excel by Powerquery - which works pretty well.
Beside several other migrations I have the following task which I am not able to solve:
In some cases - esp. after page breaks - single values that should go into single cells (one below the other) are placed into one cell joined by linebreaks and below cells are empty.
I need to split the values of such a cell (cell-content contains line-breaks) and put 2nd to n value into the according empty cells below this cell.
(It's kind of a "splitted drill-down" ...)
I am pretty new to M (not to VBA or programming) but I am not able to find a working solution.
This is difficult to do robustly but you can expand using Text.Split
on the line feed delimiter as @horseyride suggests and remove the blank rows on that second column and then smash the columns back together with Table.FromColumns
.
Here's an example you can paste into the Advanced Editor:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUUoEYkMDpVgdCDcJiI0gXCMgMxkkawrnpsTkpcbkpYEEjeCCIJ4FMs8IImcMZKaDJI3h3AwQ11wpNhYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Week = _t, A = _t, B = _t]),
TransformA = List.Select(List.Combine(List.Transform(Source[A], each Text.Split(_, "#(lf)"))), each Text.Length(_) > 0),
FromCols = Table.FromColumns({Source[Week], TransformA, Source[B]}, {"Week", "A", "B"})
in
FromCols
This takes a starting table like this:
Transforms the A
column as a list, splitting each element on the line feed character, combining each result back together, and filtering out null and empty strings:
The final step takes columns Week
and B
from the original table and sticks the transformed column A
in the middle:
You'll run into trouble if the number of extra expanded rows doesn't exactly match the number of blank rows removed but this should work under the assumption that they do match.