Is there any way to urlencode/escape while Curl is Following Location?


I am integrating the SEB Open Banking API. When cURL follows a Location redirect, it does not encode the URL the way a normal browser does.

        $url = 'https://api-sandbox.sebgroup.com/mga/sps/oauth/oauth20/authorize?' . 'client_id=XXXXXXXXXXXX&response_type=code&scope=psd2_accounts%20psd2_payments&redirect_uri=https://testcallback.com/test';
        $curl = curl_init();
        curl_setopt_array($curl, array(
            CURLOPT_HEADER => true,
            CURLOPT_URL => $url,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_MAXREDIRS => 10,
            CURLOPT_TIMEOUT => 30,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_CUSTOMREQUEST => "GET",
            CURLOPT_VERBOSE => true,
            CURLOPT_HTTPHEADER => array(
                "accept: text/html",
            ),
        ));

        $response = curl_exec($curl);
        $err = curl_error($curl);

Here is the verbose cURL output from the logs:

< HTTP/1.1 302 Found
< content-language: en-US
< date: Thu, 25 Jul 2019 21:15:49 GMT
< location: https://api-sandbox.sebgroup.com/mga/sps/authsvc?PolicyId=urn:ibm:security:authentication:asf:username_login&client_id=XXXXXXXXXXXX&response_type=code&scope=psd2_accounts psd2_payments&redirect_uri=https://testcallback.com/test&state=undefined
< p3p: CP="NON CUR OTPi OUR NOR UNI"
< x-frame-options: SAMEORIGIN
< Strict-Transport-Security: max-age=15552000; includeSubDomains
< Transfer-Encoding: chunked
< 
* Ignoring the response-body
* Connection #0 to host api-sandbox.sebgroup.com left intact
* Issue another request to this URL: 'https://api-sandbox.sebgroup.com/mga/sps/authsvc?PolicyId=urn:ibm:security:authentication:asf:username_login&client_id=XXXXXXXXXXXX&response_type=code&scope=psd2_accounts psd2_payments&redirect_uri=https://testcallback.com/test&state=undefined'
* Expire in 30000 ms for 8 (transfer 0x5572d97d6ad0)
* Found bundle for host api-sandbox.sebgroup.com: 0x5572d97744f0 [can pipeline]
* Could pipeline, but not asked to!
* Re-using existing connection! (#0) with host api-sandbox.sebgroup.com
* Connected to api-sandbox.sebgroup.com (129.178.54.70) port 443 (#0)
* Expire in 0 ms for 6 (transfer 0x5572d97d6ad0)
> GET /mga/sps/authsvc?PolicyId=urn:ibm:security:authentication:asf:username_login&client_id=XXXXXXXXXXXX&response_type=code&scope=**psd2_accounts psd2_payments**&redirect_uri=https://testcallback.com/test&state=undefined HTTP/1.1
Host: api-sandbox.sebgroup.com
Cookie: AMWEBJCT!%2Fmga!JSESSIONID=00009xuAYPuCp9GW43jcmC-CafK:f218d509-b31a-4e85-82f3-4026c87d2a41; TS01edf909=0107224bed281ed0132bcd33d1abd742777866cf59ada955adfb4e11b262eec4177bcfece6d5008e34b56a7ab37f409ab22798b97dd781fcdbe67b1d85c3acb10a1c21f2ca; TS01ef558a=0107224bed32bbf99c1c620e086bb40f0577a7d1fcada955adfb4e11b262eec4177bcfece69dd83308b2725dc487ace1c823d15bd6e2e5d0d2968f3683570ed32b96ea5da2; C0WNET=03758b02-5d3a-4321-a19f-1c022988e2f4
accept: text/html

< HTTP/1.1 400 Bad Request
< Cache-Control: no-cache
< Connection: close
< Content-Type: text/html; charset=utf-8
< Pragma: no-cache
< Content-Length: 246
< 
* Closing connection 0

The URL in this Location header contains a space (between psd2_accounts and psd2_payments), which is not converted to %20:

/mga/sps/authsvc?PolicyId=urn:ibm:security:authentication:asf:username_login&client_id=XXXXXXXXXXXX&response_type=code&scope=**psd2_accounts psd2_payments**&redirect_uri=https://testcallback.com/test&state=undefined

How can I encode the parameters of the followed Location as well, so that the URL above automatically becomes:

/mga/sps/authsvc?PolicyId=urn:ibm:security:authentication:asf:username_login&client_id=XXXXXXXXXXXX&response_type=code&scope=**psd2_accounts%20psd2_payments**&redirect_uri=https://testcallback.com/test&state=undefined

Solution

  • URLs are, by definition, already URL-encoded; otherwise the string is not a URL. HTTP redirects must point to URLs, so the Location value MUST already be encoded. Not doing so violates the HTTP spec (Source). The api-sandbox.sebgroup.com server is not returning a real URL in its redirect. Consider contacting them about the problem, since cURL is a pretty common way to access an API.

    If they can't fix this in a timely manner, I wouldn't recommend simply URL-encoding the Location header: they might fix it later, and then you would be double-encoding the URL, which is just as wrong. Encode it only if it is invalid.
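
    To make the double-encoding risk concrete, here is a small sketch. The scope value is the one from the question; the variable names are only illustrative.

    ```php
    <?php
    // Scope value as the server sends it today (raw space) and as it should be sent.
    $raw     = 'psd2_accounts psd2_payments';
    $encoded = 'psd2_accounts%20psd2_payments';

    // Encoding the raw value once gives the expected result.
    $once = rawurlencode($raw);

    // Encoding an already-encoded value escapes the '%' itself ('%' becomes '%25').
    $twice = rawurlencode($encoded);

    echo $once, PHP_EOL;   // psd2_accounts%20psd2_payments
    echo $twice, PHP_EOL;  // psd2_accounts%2520psd2_payments
    ```

    After a single decode on the server side, the second value still contains a literal "%20" instead of a space, so the scope would no longer match.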

    Therefore, I suggest removing the CURLOPT_FOLLOWLOCATION option so that curl does not follow redirects itself, and adding a CURLOPT_HEADERFUNCTION callback, which curl calls for each header received. In the callback, percent-encode the Location header only if it is present and invalid, then execute curl in a loop until a response has no Location header. Since spaces in URLs violate the spec, PHP's filter_var() function with FILTER_VALIDATE_URL correctly treats such a URL as invalid.

    $url = 'https://example.com';
    
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    
    // this function is called by curl for each header received
    curl_setopt($ch, CURLOPT_HEADERFUNCTION,
        function($curl, $header) use (&$headers) {
            $len = strlen($header);
            $header = explode(':', $header, 2);
            if (count($header) < 2) {
                // ignore invalid headers
                return $len;
            }
    
            $name = strtolower(trim($header[0]));
    
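            if ($name == 'location' && !filter_var(trim($header[1]), FILTER_VALIDATE_URL)) {
                // urlencode() on the whole URL would also escape ':', '/', '?' and '&',
                // so escape only the characters that are not allowed in a URL
                $header[1] = preg_replace_callback(
                    '/[^A-Za-z0-9\-._~:\/?#\[\]@!$&\'()*+,;=%]/',
                    function ($m) { return rawurlencode($m[0]); },
                    trim($header[1])
                );
            }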
    
            $headers[$name][] = trim($header[1]);
    
            return $len;
        }
    );
    
    // Maximum number of redirects
    $max_iterations = 10;
    $iterations = 0;
    
    do {
        $url = $headers['location'][0] ?? $url;
        $headers = [];
        curl_setopt($ch, CURLOPT_URL, $url);
        $data = curl_exec($ch);
        print_r($headers);
    
    } while (isset($headers['location']) && ++$iterations < $max_iterations);
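
    The "encode only if invalid" step can also be tried in isolation, without any network access. The sketch below is an assumption-laden illustration: fix_location_url() is a hypothetical helper name (not part of PHP or curl), and the URL is a shortened version of the Location value from the question's log. It escapes only characters that may not appear in a URL, so ':', '/', '?', '&' and existing %XX escapes are left alone.

    ```php
    <?php
    // Hypothetical helper: percent-encode only the characters that are not allowed in a URL.
    function fix_location_url(string $location): string
    {
        $location = trim($location);

        // A URL that already validates is returned untouched, which avoids double-encoding.
        if (filter_var($location, FILTER_VALIDATE_URL) !== false) {
            return $location;
        }

        // Escape every byte outside the unreserved and reserved sets; '%' is kept as-is.
        return preg_replace_callback(
            '/[^A-Za-z0-9\-._~:\/?#\[\]@!$&\'()*+,;=%]/',
            function (array $m): string {
                return rawurlencode($m[0]);
            },
            $location
        );
    }

    // Shortened Location value from the log, with the raw space in the scope parameter.
    $broken = 'https://api-sandbox.sebgroup.com/mga/sps/authsvc?scope=psd2_accounts psd2_payments&state=undefined';
    $fixed  = fix_location_url($broken);

    echo $fixed, PHP_EOL;

    // Running the helper on its own output changes nothing.
    var_dump(fix_location_url($fixed) === $fixed);
    ```

    Because the helper leaves valid input unchanged, it should keep working if the server later starts sending a correctly encoded Location header. It does not resolve relative Location values against the request URL; that would need separate handling.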