I have a custom proxy servlet that has to deal with URLs that contain special characters (e.g. ; , . /) in their path. This is because it is a RESTful application that has ugly path params by design. (Please don't comment on that; it is not mine.)
My client (actually wget, because browsers tend to display the URL unescaped) sends a request to this URL:
http://localhost:8080/MyApplication/proxy/foo/ugly%3Apart%2Fcomes%3Bhere/children
//note: %2F = '/', %3A = ':', %3B = ';'
In my servlet (mapped to /proxy/*), when I try to forward the GET request, I am unable to reconstruct it because HttpServletRequest.getPathInfo() returns the path already unescaped, so the URL I rebuild from it is:
http://localhost:8080/MyApplication/proxy/foo/ugly:part/comes;here/children
Therefore the information about which / and ; characters were originally escaped is lost. That makes a difference for me: for example, an unescaped ; turns my URL into a so-called matrix URL (see http://www.w3.org/DesignIssues/MatrixURIs.html), and the extra slashes shift all the REST path parameters.
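To make the loss concrete, here is a minimal sketch that only imitates the decoding with java.net.URLDecoder. It is not what the container runs internally, and the class and method names are made up for illustration (URLDecoder.decode with a Charset argument assumes Java 10 or newer).

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class DecodeLossDemo {

    // Decodes the whole path in one go, similar in effect to what
    // getPathInfo() hands back (illustrative helper, not a servlet API call).
    public static String decodeWholePath(String rawPath) {
        return URLDecoder.decode(rawPath, StandardCharsets.UTF_8);
    }

    // Counts the non-empty segments between slashes.
    public static int segmentCount(String path) {
        int count = 0;
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String raw = "/foo/ugly%3Apart%2Fcomes%3Bhere/children";
        String decoded = decodeWholePath(raw);

        System.out.println(decoded);               // /foo/ugly:part/comes;here/children
        System.out.println(segmentCount(raw));     // 3
        System.out.println(segmentCount(decoded)); // 4
    }
}
```

After decoding, the single ugly segment has become two, and nothing in the result says which slash or semicolon was escaped in the request.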
Actually I found this issue on a Glassfish server, so I'm not sure if different application servers treat this differently or not. I found only this in the Servlet API:
getPathInfo() Returns any extra path information associated with the URL the client sent when it made this request.
How can I get the original request URL exactly as the client sent it, with the escapes still intact?
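Have a look at HttpServletRequest's getRequestURI() and getRequestURL() methods. The Javadoc for getRequestURI() states that the web container does not decode the returned string, so the escapes sent by the client are preserved.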
If you need to remove the context and servlet mappings, look at getContextPath() and getServletPath().
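As a rough sketch of how these pieces could fit together: the helper below takes the three strings you would obtain from request.getRequestURI(), request.getContextPath() and request.getServletPath(), and strips the prefix to leave the still-escaped path info. The class and method names are hypothetical, and it assumes the context and servlet paths contain no escaped characters themselves (true for /MyApplication and /proxy) and that the URI carries no extra path parameters such as ;jsessionid. It uses plain strings so it runs without a servlet container; URLDecoder.decode with a Charset argument assumes Java 10 or newer.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class RawPathInfo {

    // Strips contextPath + servletPath from the undecoded request URI.
    // In a servlet the arguments would come from getRequestURI(),
    // getContextPath() and getServletPath().
    public static String rawPathInfo(String requestUri, String contextPath, String servletPath) {
        String prefix = contextPath + servletPath;
        if (!requestUri.startsWith(prefix)) {
            throw new IllegalArgumentException("URI does not start with " + prefix);
        }
        return requestUri.substring(prefix.length());
    }

    // Splits on the literal slashes first and decodes each segment afterwards,
    // so an escaped %2F stays inside its segment.
    public static List<String> decodedSegments(String rawPath) {
        List<String> segments = new ArrayList<>();
        for (String raw : rawPath.split("/")) {
            if (raw.isEmpty()) {
                continue;
            }
            // URLDecoder treats '+' as a space (form encoding), which is wrong
            // for URL paths, so protect literal plus signs before decoding.
            String protectedRaw = raw.replace("+", "%2B");
            segments.add(URLDecoder.decode(protectedRaw, StandardCharsets.UTF_8));
        }
        return segments;
    }

    public static void main(String[] args) {
        String uri = "/MyApplication/proxy/foo/ugly%3Apart%2Fcomes%3Bhere/children";
        String raw = rawPathInfo(uri, "/MyApplication", "/proxy");

        System.out.println(raw);                  // /foo/ugly%3Apart%2Fcomes%3Bhere/children
        System.out.println(decodedSegments(raw)); // [foo, ugly:part/comes;here, children]
    }
}
```

For a proxy, forwarding the raw string unchanged is probably the safest choice; decoding per segment is only needed if the servlet itself has to inspect the path parameters.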