
Syntax error in regular expression to match link URL


I have the following method in some Nemerle code:

private static getLinks(text : string) : array[string] {
        def linkrx = Regex(@"<a\shref=['|\"](.*?)['|\"].*?>");
        def m = linkrx.Matches(text);
        mutable txmatches : array[string];
        for (mutable i = 0; i < m.Count; ++i) {
            txmatches[i] = m[i].Value;
        }
        txmatches
    }

The problem is that the compiler, for some reason, is trying to parse the brackets inside the regex string, and that stops the program from compiling. If I remove the @ (which I was told to put there), I get an invalid escape character error on the "\s".

Here's the compiler output:

NCrawler.n:23:21:23:22: error: when parsing this `(' brace group
NCrawler.n:23:38:23:39: error: unexpected closing bracket `]'
NCrawler.n:22:57:22:58: error: when parsing this `{' brace group
NCrawler.n:23:38:23:39: error: unexpected closing bracket `]'
NCrawler.n:8:1:8:2: error: when parsing this `{' brace group
NCrawler.n:23:38:23:39: error: unexpected closing bracket `]'
NCrawler.n:23:38:23:39: error: unexpected closing bracket `]'

(Line 23 is the line with the regex on it.)

What should I do?


Solution

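  • I don't know Nemerle, but it seems like using @ disables all escapes, including the \" escape for the quote. The string would then end at that first ", and the compiler would try to parse the rest of the pattern as code, which would explain the complaints about an unexpected `]'.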

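    Try one of these (I dropped the | from the character classes, since inside [...] it is a literal pipe rather than alternation):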

    def linkrx = Regex("<a\\shref=['\"](.*?)['\"].*?>");
    
    def linkrx = Regex(@"<a\shref=['""](.*?)['""].*?>");
    
    def linkrx = Regex(@"<a\shref=['\x22](.*?)['\x22].*?>");
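
    Dropped back into the method from the question, the second form might look like the sketch below. It is untested, and it assumes Nemerle's @ strings follow the C# rule of writing a literal " as "". The module name LinkExtractor is made up for the example. The sketch also allocates the result array with array(m.Count), which the original method never did; that is my addition and not part of the compile error you asked about.

    ```nemerle
    using System.Text.RegularExpressions;

    // Hypothetical wrapper module so the sketch is self-contained.
    module LinkExtractor {
        public getLinks(text : string) : array[string] {
            // Verbatim string: backslashes are literal, "" stands for one quote
            // (assumed to work as in C#).
            def linkrx = Regex(@"<a\shref=['""](.*?)['""].*?>");
            def m = linkrx.Matches(text);
            // Allocate one slot per match before writing into the array.
            def txmatches : array[string] = array(m.Count);
            for (mutable i = 0; i < m.Count; ++i) {
                // Value is the whole matched <a ...> tag, as in the question.
                txmatches[i] = m[i].Value;
            }
            txmatches
        }
    }
    ```

    If you only want the URL rather than the whole tag, m[i].Groups[1].Value should give you the first capture group instead.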