PHP Regex Parse query string containing un-encoded ampersands


I'm receiving a query string (from a terrible payment system whose name I do not wish to sully publicly) that contains un-encoded ampersands:

name=joe+jones&company=abercrombie&fitch&other=no

parse_str can't handle this: it splits on every ampersand, so company comes back as abercrombie and fitch becomes a separate, empty key. I don't know enough regex to come up with my own scheme (though I did try). My hang-up was look-ahead, which I did not quite understand.

What I'm looking for:

Array
(
    [name] => joe jones
    [company] => abercrombie&fitch
    [other] => no
)

I thought about traipsing through the string, ampersand by ampersand, but that just seemed silly. Help?


Solution

  • How about this:

    If two ampersands occur with no = between them, encode the first one. Then pass the result to the normal query string parser.

    That should accomplish your task. This works because a well-formed query string alternates equals signs and ampersands; two ampersands with no = between them therefore mean the first one should have been encoded. As long as keys don't have ampersands in them, the last ampersand before an = is always the "real" ampersand preceding a new key.

    You should be able to use the following regex to do the encoding:

    $better_qs = preg_replace("/&(?=[^=]*&)/", "%26", $bad_qs);
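
For completeness, here is a minimal sketch that combines the replacement above with parse_str. The function name parse_sloppy_query_string is hypothetical (it is not part of PHP or of the original answer), and the sketch assumes keys never contain ampersands and values never contain a literal = sign.

```php
<?php
// Hypothetical helper: repairs un-encoded ampersands inside values,
// then hands the result to PHP's regular query string parser.
function parse_sloppy_query_string(string $bad_qs): array
{
    // An "&" followed by another "&" before the next "=" cannot start
    // a new key=value pair, so it must belong to a value: encode it.
    // Fall back to the raw input if preg_replace reports an error (null).
    $better_qs = preg_replace('/&(?=[^=]*&)/', '%26', $bad_qs) ?? $bad_qs;

    $params = [];
    parse_str($better_qs, $params);
    return $params;
}

$result = parse_sloppy_query_string('name=joe+jones&company=abercrombie&fitch&other=no');
print_r($result);
// Array
// (
//     [name] => joe jones
//     [company] => abercrombie&fitch
//     [other] => no
// )
```

One limitation to keep in mind: an un-encoded ampersand in the very last value (for example other=no&more) has no later ampersand after it, so the look-ahead does not match and parse_str will still split there.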