
Regex to exclude results after dynamic part of URL


I am setting up heatmap tracking rules for the Matomo analytics platform. One part of my URL will always be dynamic and contain only capital letters, for example http://example.com/ASDFG. I can match this with http:\/\/example\.com\/[A-Z]+.

But it gets trickier when tracking subdirectories, for example http://example.com/ASDFG/page1.

Because it is not anchored at the end, http:\/\/example\.com\/[A-Z]+ will track not only http://example.com/ASDFG but also http://example.com/ASDFG/page1.
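
The over-matching can be reproduced with a short Python sketch. Python's `re` module and "match anywhere" semantics (`re.search`) are assumptions made for illustration; Matomo's own rule engine may apply the expression differently.

```python
import re

# Unanchored pattern from the question. Using re.search (match anywhere in
# the string) is an assumption about how the tracking rule is evaluated.
unanchored = re.compile(r"http:\/\/example\.com\/[A-Z]+")

urls = [
    "http://example.com/ASDFG",
    "http://example.com/ASDFG/page1",
]

# Record whether each URL is matched by the unanchored pattern.
matches = {url: bool(unanchored.search(url)) for url in urls}

for url, matched in matches.items():
    print(matched, url)
```

Both URLs match, since nothing in the pattern forbids extra path segments after the capital letters.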

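Ideally, I would have two separate expressions: one that matches only the dynamic top-level URL (such as http://example.com/ASDFG) and one that matches its subdirectories (such as http://example.com/ASDFG/page1).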

There is no need to match the www prefix, as the analytics platform prepends it automatically.

What would be the best way to write these two expressions?


Solution

  • I'm guessing that maybe,

    ^https?:\/\/example\.com\/[A-Z]+\/?$
    ^http:\/\/example\.com\/[A-Z]+\/?$
    

    or without final slashes,

    ^https?:\/\/example\.com\/[A-Z]+$
    ^http:\/\/example\.com\/[A-Z]+$
    

    might be desired for the first one.



    For the second one, it would be as simple as,

    ^https?:\/\/example\.com\/[A-Z]+\/(?:page1|page2|page3)\/?$
    ^http:\/\/example\.com\/[A-Z]+\/(?:page1|page2|page3)\/?$
    

    for multiple pages, and

    ^https?:\/\/example\.com\/[A-Z]+\/page1\/?$
    ^https?:\/\/example\.com\/[A-Z]+\/page2\/?$
    ^https?:\/\/example\.com\/[A-Z]+\/page3\/?$
    

    for one page at a time.


    You can also drop the trailing \/? if an optional final slash is not needed, and likewise the s? if https does not need to be matched.


    If you wish to simplify, modify, or explore the expression, regex101.com explains it in its top-right panel. There you can also watch how it matches against some sample inputs.


    RegEx Circuit

    jex.im visualizes regular expressions.
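
The two anchored expressions above can be checked against a few sample URLs with a minimal Python sketch. The sample URLs, the `classify` helper, and the use of Python's `re` module are assumptions for illustration only; they are not part of Matomo, whose regex flavor may differ slightly.

```python
import re

# First expression: only the dynamic top-level URL, optional trailing slash.
root_only = re.compile(r"^https?:\/\/example\.com\/[A-Z]+\/?$")

# Second expression: a known subpage below the dynamic part.
subpages = re.compile(
    r"^https?:\/\/example\.com\/[A-Z]+\/(?:page1|page2|page3)\/?$"
)

# Hypothetical sample URLs, chosen only to exercise both patterns.
samples = [
    "http://example.com/ASDFG",
    "https://example.com/ASDFG/",
    "http://example.com/ASDFG/page1",
    "http://example.com/ASDFG/page2/",
    "http://example.com/asdfg",        # lowercase, should match neither
    "http://example.com/ASDFG/page4",  # page not listed in the alternation
]


def classify(url):
    """Return which of the two expressions (if any) matches the URL."""
    if root_only.search(url):
        return "root"
    if subpages.search(url):
        return "subpage"
    return "none"


results = {url: classify(url) for url in samples}

for url, label in results.items():
    print(f"{label:8} {url}")
```

Thanks to the `^` and `$` anchors, each URL falls into at most one group, so the two heatmap rules do not overlap.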