In my handler (registered with ap_hook_handler), I'm seeing strange behavior in request_rec->args when parts of the query string contain URL-encoded characters.
Here are my findings:
Scenario #1: encode the first 'e' in the query string:
Result:
r->uri: /test
r->args: "group= %mployees" (observe the many spaces)
Scenario #2: encode the second 'e':
Result:
r->uri: /test
r->args: "group=employ 0.000000e-01s"
Scenario #3: encode the last 'e':
Result: seg fault
When I URL-encode any portion of the path (not the query string), Apache behaves as expected:
Scenario #4: encode 'e' in the path instead of query string:
Result:
r->uri: /test (expected)
r->args: NULL (expected)
Why do 'args' and 'uri' handle URL encoding differently, and how can I get the canonicalized query string within my module, as I can with 'request_rec->uri'?
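The strange results in my logs came from passing the query string to printf as the format string. The percent sign introduces a conversion specifier there, so an escape such as %65 followed by a letter was interpreted as a format directive instead of being printed literally. That explains the padding, the floating-point output, and the seg fault.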
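A minimal sketch of the fix, assuming the request looked something like /test?group=%65mployees (the original URLs are not shown above, so that string is only an illustration). The helper name format_args_line is hypothetical. The point is that the query string is passed as data to a "%s" conversion and never as the format itself:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: builds a log line from the raw query string.
 * The query string is an argument to "%s"; it is never the format. */
static int format_args_line(char *out, size_t outlen, const char *args)
{
    assert(out != NULL && outlen > 0);

    /* UNSAFE (what I was effectively doing):
     *     snprintf(out, outlen, args);
     * With "%65m", "%65e" or "%65s" in args, printf parses a conversion
     * with field width 65 and reads arguments that were never passed. */

    /* SAFE: '%' characters inside args are copied through untouched. */
    return snprintf(out, outlen, "r->args: %s", args ? args : "(null)");
}
```

Inside a module the same rule applies to the Apache logging calls, for example ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, "r->args: %s", r->args), where the format is a literal and r->args is only an argument.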
I now reconstruct the URL and pass it to the ap_unescape_url function to decode it.
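To show what that decoding step amounts to, here is a simplified standalone stand-in for percent-decoding a string in place. It is not Apache's implementation: the names hex_value and percent_decode_in_place are hypothetical, and the real ap_unescape_url applies additional rules and returns Apache status codes rather than 0 and -1.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = (unsigned char)tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical stand-in for the decoding step: rewrites every %XX
 * escape in s with the byte it encodes, in place.
 * Returns 0 on success, -1 on a malformed escape or a decoded NUL.
 * On failure the buffer may be partially rewritten. */
static int percent_decode_in_place(char *s)
{
    assert(s != NULL);

    char *out = s;
    for (const char *in = s; *in != '\0'; ) {
        if (*in == '%') {
            int hi = hex_value((unsigned char)in[1]);
            /* Only look at in[2] if in[1] was a valid digit, so we
             * never read past the terminating NUL. */
            int lo = (hi >= 0) ? hex_value((unsigned char)in[2]) : -1;
            if (hi < 0 || lo < 0) return -1;

            int ch = hi * 16 + lo;
            if (ch == 0) return -1;   /* refuse an embedded NUL */

            *out++ = (char)ch;
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return 0;
}
```

Because the decoding happens in place, in a module I would decode a copy, for example one made with apr_pstrdup(r->pool, r->args), and leave r->args itself untouched.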
Now that I think of it, it makes sense that Apache doesn't automatically decode args: the query string is essentially the "data" portion of the URL rather than part of the path.