Search code examples
mysqlregextrailing-slashword-boundary

MySQL REGEXP with word boundary after forward slash


I have some URLs without http:// in a database table:

        url
row #1: 10.1.127.4/
row #2: 10.1.127.4/something

Now, the following filter gives me row #2 - fine:

SELECT * FROM mytable WHERE url REGEXP '[[:<:]]10.1.127.4/something[[:>:]]'

But the following filter does not give me row #1, but shouldn't it?

SELECT * FROM mytable WHERE url REGEXP '[[:<:]]10.1.127.4/[[:>:]]'

I should note that also escaping the forward slash via backslash doesn't return the wanted row #1:

SELECT * FROM mytable WHERE url REGEXP '[[:<:]]10.1.127.4\/[[:>:]]'


Solution

  • Found out that [[:>:]] expects a word character to the left, and vice versa for [[:<:]]

    Simple tests verify that:

    SELECT 'bla,,123' REGEXP '[[:<:]]bla,[[:>:]]' -- no match
    SELECT 'bla,,123' REGEXP '[[:<:]]bla[[:>:]]' -- match
    SELECT 'bla,,123' REGEXP '[[:<:]]bla,,123[[:>:]]' -- match
    

    I think this way the documentation makes sense, and I misunderstood that for quite a few years:

    [...] word boundaries. They match the beginning and end of words, [...]

    So, a word boundary expects

    1. a non-word character on the one side
    2. and a word character on the other side