I'm trying to extract text between delimiters for all available matches. The input column and the desired output are shown below:
Index | Country (input) | Country (desired output) |
---|---|---|
0 | 1, USA; 2, France; 3, Germany; | USA, France, Germany |
1 | 4, Spain; | Spain |
2 | 1, USA; 5, Italy; | USA, Italy |
I tried to use the "Extract" and "Split columns" features by using ", "
and ";"
as delimiters but it didn't work as desired. I also tried to use Text.BetweenDelimiters
and Splitter.SplitTextByEachDelimiter
but I couldn't come up with a solution.
I wanted to write a loop in Power Query that can extract this data recursively, until all countries are extracted to a new column for each row.
Any ideas? Thanks in advance!
Seems like what you are doing is splitting on semicolon, then splitting on comma, then combining the results. So lets do that
Right click the column and split on semicolon, each occurrence of the delimiter, advanced option Rows
Right click the new column and split on comma, each occurrence of the delimiter, advanced option Columns
Right click the index and group
Edit the grouping formula in the formula bar or in home..advanced editor... to replace whatever it has as a default and instead end with this, which combines all the rows using a , delimiter
, each Text.Combine([ColumnNameGoesHere]," "), type text}})
Sample code:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Country (input)", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv)}}), "Country (input)"),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Split Column by Delimiter", "Country (input)", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Country (input).1", "Country (input).2"}),
#"Grouped Rows" = Table.Group(#"Split Column by Delimiter1", {"Index"}, {{"Country (desired output)", each Text.Combine([#"Country (input).2"],", "), type text}})
in #"Grouped Rows"
~ ~ ~
I assume this is simplified data, otherwise it would simpler to just remove all numbers and semicolons in a single step
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Text" = Table.TransformColumns(Source,{{"Country (input)", each Text.RemoveRange(Text.Remove(_, {"1","2","3","4","5","6","7","8","9","0",";"}),0), type text}})
in #"Text"