Search code examples
regexpostgresqlurl-patterngreenplum

Issue of Regular Expression for URL pattern on POSTGRES


select regexp_replace('https://www.facebook.com/cricket/hello', '.*\..*?\/', '')

The above code is giving me

hello

instead of

cricket/hello

I checked on Regexp checking website and the pattern is correct. I am not sure where am I going wrong.

DBMS: "PostgreSQL 8.2.15 (Greenplum Database 4.2.8.3 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Nov 2 2014 01:33:14"


Solution

  • I don't know how, but this worked

    .*?\.[a-z]+\/
    

    Taking Andrew Wolfe's query on weirdest kinds of URLs.

    select testval, regexp_replace ( testval,  '.*?\.[a-z]+\/',  '')
    from (
        select 'https://www.facebook.com/cricket/hello' as testval
      union all
      select 'http://a.b.co.uk/cric.ke.t/hello' as testval
      union all
      select 'ftp://a.b.com.d.e.f/relroot/cricket/hello' as testval  union all
      select 'http://www.google.co.uk/cricket/hello' as testval  
      union all
      select 'http://a.b.co.uk/cricket/hello/this/is/a/little/longer?and&it=has&args' as testval
    ) vals
    

    enter image description here