Extract last part of url without query string or jsessionid


I want a regex that always returns the last part of a URL, before the query string parameters and without the jsessionid if one is present.

Here are some example URLs:

http://www.somesite.com/some/path/test.action;jsessionid=000063vCmvJAn7VWyymA_dPsHZs:16u9pglit?sort=2&param1=1&param2=2
http://www.somesite.com/some/path/test;jsessionid=000063vCmvJAn7VWyymA_dPsHZs:16u9pglit?sort=2&param1=1&param2=2
http://www.somesite.com/some/path/test.action?sort=2&param1=1&param2=2
http://www.somesite.com/some/path/test?sort=2&param1=1&param2=2

Here's my regex so far:

.*http://.*/some/path.*/(.*);?.*\?.*

It works for the URLs that do not contain a jsessionid, but returns test;jsessionid=... when one is present.

To test: http://regex101.com/r/fM0mE2


Solution

  • I would use this regex:

    .*http:\/\/.*\/some\/path.*\/([^;\?]+);?.*\?.*
                                  ^^^^^^ 
    

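    The original (.*) is greedy, so it runs up to the last ? and takes the ;jsessionid=... part with it. The character class [^;\?]+ instead matches one or more characters that are neither ; nor ?, so the capture stops before the jsessionid or the query string. Since nothing after the capture group is needed, I think it might be shortened to: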

    .*http:\/\/.*\/some\/path.*\/([^;\?]+)
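
As a quick check, here is a minimal Python sketch that runs the shortened pattern against the four URLs from the question. The names `PATTERN`, `URLS` and `last_segment` are made up for this illustration, and Python's `re` module is only one possible engine. The question does not say which one is in use.

```python
import re

# Shortened pattern from the answer: the capture group stops at ';' or '?'.
PATTERN = re.compile(r".*http:\/\/.*\/some\/path.*\/([^;\?]+)")

# The four example URLs from the question.
URLS = [
    "http://www.somesite.com/some/path/test.action;jsessionid=000063vCmvJAn7VWyymA_dPsHZs:16u9pglit?sort=2&param1=1&param2=2",
    "http://www.somesite.com/some/path/test;jsessionid=000063vCmvJAn7VWyymA_dPsHZs:16u9pglit?sort=2&param1=1&param2=2",
    "http://www.somesite.com/some/path/test.action?sort=2&param1=1&param2=2",
    "http://www.somesite.com/some/path/test?sort=2&param1=1&param2=2",
]


def last_segment(url):
    """Return the captured last path segment, or None if the pattern does not match."""
    match = PATTERN.match(url)
    return match.group(1) if match else None


for url in URLS:
    print(last_segment(url))
# prints: test.action, test, test.action, test (one per line)
```

One caveat with the shortened form: because `.*\/` is greedy, a slash inside the query string (for example `?next=/home`) would shift the capture to the text after that last slash. The longer version still requires a `?` after the capture, so it does not have this problem.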