Search code examples
excelpowerbipowerquerycustom-code

How to pass user parameter into string within function power query


I have the following function, which splits text into sentences using regex. Upon testing, however, these are instances where the regex doesn't quite work, and the text is wrongly split. For example, if the text contains St. bernard, I do not want this sentence to be split on the . of St.

As a nice workaround, I have modified the regex to allow for exceptions to be ignored. Please see here if you are interested in this.

I now wish to incorporate this into excel such that any user can apply their own exceptions however I am having trouble getting the user parameter to be passed into the string (regex) of the function.

Here is what I am trying to achieve (stating |Flam|Liq|St explicitly in the regex):

enter image description here

regex:

\s*((?:\b(?:[djms]rs?|flam|liq|St)\.|\b(?:[a-z]\.){2,}|\.\d[\d.]*|\.(?:com|net|org)\b|[^.?!])+(?:[.?!]+|$)) (https://regex101.com/r/nXf0TM/6)

However, what I would like to achieve is something like:

\s*((?:\b(?:[djms]rs?|"&Exceptions&")\.|\b(?:[a-z]\.){2,}|\.\d[\d.]*|\.(?:com|net|org)\b|[^.?!])+(?:[.?!]+|$))

Where Exceptions is the User parameters entered in a table in excel: flam, liq, St in each row.

i.e.:

enter image description here

M code attempting to achieve this results in an error:

Sentences From text:

let
        
        Exceptions = Exceptions,
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Text", type text}}),
        #"Replaced Value1" = Table.ReplaceValue(#"Changed Type","#(lf)"," ",Replacer.ReplaceText,{"Text"}),
        #"Replaced Value" = Table.ReplaceValue(#"Replaced Value1","'","&apos",Replacer.ReplaceText,{"Text"}),
        #"Invoked Custom Function" = Table.AddColumn(#"Replaced Value", "fnRegexReplace", each fnRegexReplace([Text], "\s*((?:\b(?:[djms]rs?"&Exceptions&")\.|\b(?:[a-z]\.){2,}|\.\d[\d.]*|\.(?:com|net|org)\b|[^.?!])+(?:[.?!]+|$))", "$1|")),
        #"Removed Other Columns" = Table.SelectColumns(#"Invoked Custom Function",{"fnRegexReplace"}),
        #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Removed Other Columns", {{"fnRegexReplace", Splitter.SplitTextByDelimiter("|", QuoteStyle.None), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "fnRegexReplace"),
        #"Filtered Rows" = Table.SelectRows(#"Split Column by Delimiter", each ([fnRegexReplace] <> ""))
    in
        #"Filtered Rows"

Exceptions:

    let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "Custom", each "Exceptions"),
    #"Grouped Rows" = Table.Group(#"Added Custom", {"Custom"}, {{"Exceptions", each Text.Combine([#"Do not split if:"],"|"), type text}})
in
    #"Grouped Rows"

fnRegexReplace

(x,y,z)=>
let 
   
   y = Text.Replace(y,"\","\\"),
   
   Source = Web.Page(
                     "<script>var x="&"'"&x&"'"&";var z="&"'"&z&
                     "'"&";var y=new RegExp('"&y&"','gmi');
                     var b=x.replace(y,z);document.write(b);</script>")
                     [Data]{0}[Children]{0}[Children]{1}[Text]{0}
in 
   Source

Error:

enter image description here

Raw text Data:

Highly Flammable Liquid Flam. H223 Liq. H334. 
St. Bernard Dog was present.
The MW of gold is 100.1. Solubility is 40mg/L.

Im sure this is an easy fix, but whatever I try, i.e. Record.FromTable{0} etc I get various errors.

If anyone could help me out, that would be great.

Thank you.


Solution

  • This is the problem line fixed.

    = Table.AddColumn(#"Replaced Value", "fnRegexReplace", each fnRegexReplace([Text], "\s*((?:\b(?:[djms]rs?"&Exceptions[Exceptions]{0}&")\.|\b(?:[a-z]\.){2,}|\.\d[\d.]*|\.(?:com|net|org)\b|[^.?!])+(?:[.?!]+|$))", "$1|"))