Tags: apache, apache-modules

Apache module request_rec->args can't handle url encoded entities


In my handler (registered with ap_hook_handler) I'm seeing strange behavior with request_rec->args when parts of the query string contain URL-encoded characters.

Here are my findings:

Scenario #1: encode the first 'e' in the query string:

http://localhost/test?group=%65mployees

Result:
r->uri: /test
r->args: "group=           %mployees" (observe the many spaces)

Scenario #2: encode the second 'e':

http://localhost/test?group=employ%65es

Result:
r->uri: /test
r->args: "group=employ          0.000000e-01s"

Scenario #3: encode the last 'e':

http://localhost/test?group=employe%65s

Result: seg fault

When I url encode any portion of the path (not the query string) Apache behaves:

Scenario #4: encode 'e' in the path instead of query string:

http://localhost/t%65st

Result:
r->uri: /test (expected)
r->args: NULL (expected)

Why do 'args' and 'uri' handle URL encoding differently, and how can I get the canonicalized query string within my module, as I can with 'request_rec->uri'?


Solution

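  • The strange results in my logs came from my own logging code, not from Apache: I was passing the query string to printf as the format string, and the percent sign introduces a conversion specification there. In scenario #2, printf read "%65e" as a floating-point conversion with a field width of 65; in scenario #3 it read "%65s" as a string conversion and dereferenced a pointer argument that was never passed, hence the seg fault. In scenario #1, "%65m" is not a standard conversion, so the output depends on the C library. r->args itself held the raw, still-encoded query string the whole time.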

    I now reconstruct the URL and pass it to the ap_unescape_url function, which decodes it in place.

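    Now that I think of it, it makes sense that Apache doesn't automatically decode the args field: it is essentially the "data" portion of the URL rather than part of the path, and decoding it wholesale would make an encoded "&" or "=" inside a value indistinguishable from a real delimiter.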
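
A minimal sketch of both fixes in plain C, so it compiles without the httpd headers. The helper names format_args_line and percent_decode_in_place are hypothetical, and the decoder is only a stand-in for the kind of in-place decoding ap_unescape_url performs inside a module; it is not Apache's implementation and does not apply Apache's extra checks.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build a log line from a raw query string.
 * The query string is passed as DATA for "%s", never as the format
 * string itself, so a "%65e" inside it is copied verbatim.
 * The buggy variant would be: snprintf(out, out_size, args); */
static int format_args_line(char *out, size_t out_size, const char *args)
{
    return snprintf(out, out_size, "r->args: %s", args ? args : "(null)");
}

/* Map one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical stand-in for an in-place percent decoder.
 * Returns 0 on success and -1 on a malformed escape; on failure the
 * buffer may be left partially decoded. */
static int percent_decode_in_place(char *s)
{
    char *dst = s;
    const char *src = s;

    while (*src != '\0') {
        if (*src == '%') {
            int hi = hex_value(src[1]);
            /* Only look at src[2] when src[1] was a valid digit, so we
             * never read past the terminating NUL. */
            int lo = (hi >= 0) ? hex_value(src[2]) : -1;
            if (hi < 0 || lo < 0) {
                return -1;
            }
            *dst++ = (char)(hi * 16 + lo);
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return 0;
}
```

Inside a real handler the logging fix would look like ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, "args: %s", r->args). For decoding, one option is to check r->args for NULL, copy it with apr_pstrdup(r->pool, r->args), and call ap_unescape_url on the copy, since the function modifies its argument. If individual parameters matter, splitting on "&" and "=" before decoding each piece is likely the safer order.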