Tags: php, json, string, urldecode

Part of string missing after urldecode() in php


I have a URL-encoded string (it is too long to be posted here). When I decode it with an online utility (http://www.the-art-of-web.com/javascript/escape/), the result looks perfect. However, when I pass the same string through urldecode() in my PHP file on my testing environment, the first 100 or so characters are missing, and I cannot figure out why. I have tried both urldecode() and rawurldecode(). If you want to see the string I am trying to process, you can make a GET request against http://pacific-wave-7885.herokuapp.com/api/opencart; the string I am working with is the "contents" value of the JSON object.

What I am trying to accomplish:

I want to write a PHP file that calls the above API address, gets the contents from the JSON object, decodes the string and parses the code.

Here is what I have tried:

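// Decode a URL-encoded string, including non-standard %uXXXX escapes.
function utf8_urldecode($str) {
  // After urldecode(), turn any remaining %uXXXX sequences into numeric HTML entities.
  $str = preg_replace("/%u([0-9a-f]{3,4})/i", "&#x\\1;", urldecode($str));
  // Pass explicit flags instead of null; ENT_NOQUOTES has the same value (0).
  return html_entity_decode($str, ENT_NOQUOTES, 'UTF-8');
}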

$opts = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"Accept-language: en\r\n" .
              "Cookie: foo=bar\r\n"
  )
);
$context = stream_context_create($opts);

$file = file_get_contents('http://pacific-wave-7885.herokuapp.com/api/opencart', false, $context);

$contents_decode = utf8_urldecode($file);

echo $contents_decode;

You can see with this code that the "contents" starts with "language->load('shipping/usps')" and is missing the first 100 or so characters of that part of the string.

I have also tried this:

  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, "http://pacific-wave-7885.herokuapp.com/api/opencart");
  // Return the response body as a string instead of printing it.
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  $output = curl_exec($ch);

  // The response is a JSON array; take "contents" from its first element.
  $data = json_decode($output, true);
  $contents = $data[0]['contents'];

  $contents_decode = urldecode($contents);
  echo $contents_decode;

  curl_close($ch);

This also produces the same result - part of the beginning of the string is missing.

To reiterate: if you grab the encoded string straight from the JSON object and use an online decoding tool, the string looks great, but once it is passed through urldecode() in my PHP file, the first part of the string is missing characters.

If anyone can see what I am missing I would be so grateful.

Just FYI: my PHP environment is the latest version of XAMPP with PHP 5, and the JSON object is coming from a Node server with express.js.

If anyone has a better idea of how I can pass php code as a string from a Node server to a PHP server and then parse it I would be open to that as well.


Solution

  • I'm going to make two assumptions:

    1. The output you see starts at language->load('shipping/usps');

    2. You are viewing the returned string in your browser.

    The good news is that you aren't missing any characters! The browser is simply misinterpreting the start of the string, which looks like this:

    <?php
        class ModelShippingUsps extends Model {
            public function getQuote($address) {
                $this->
    

    The browser is trying to make sense of it. It doesn't know this is PHP; it thinks it is HTML. It sees < and thinks "oh cool, beginning of an HTML tag." Then it sees ?, which cannot start a tag name, so it treats everything up to the next > as an HTML comment. The next > is the one in $this->, so it parses the start of the string as:

    <!--?php
    class ModelShippingUsps extends Model {
        public function getQuote($address) {
            $this--->
    

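    Change echo $contents_decode; to echo "<textarea>" . $contents_decode . "</textarea>"; and you'll see the full string. Viewing the page source in the browser (rather than the rendered page) will also show that nothing is missing.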
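
As an alternative to the textarea, here is a minimal sketch that escapes the decoded string before echoing it, so the browser shows the < and > characters instead of parsing them as markup. The helper name show_source_in_browser and the short sample string are made up for illustration; they are not part of the original question or the real API response.

```php
<?php
// Hypothetical helper (name is illustrative): URL-decode a string and make it
// safe to display as text inside an HTML page.
function show_source_in_browser($encoded)
{
    $decoded = urldecode($encoded);

    // htmlspecialchars() converts <, >, &, and quotes into HTML entities,
    // so the browser renders them literally instead of treating them as tags.
    return '<pre>' . htmlspecialchars($decoded, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</pre>';
}

// Made-up sample standing in for the "contents" value from the JSON object.
$encoded = '%3C%3Fphp%20class%20ModelShippingUsps%20extends%20Model%20%7B%20'
         . '%24this-%3Elanguage-%3Eload(%27shipping%2Fusps%27)%3B%20%7D';

echo show_source_in_browser($encoded);
```

If the page only needs to show the raw text, sending header('Content-Type: text/plain'); before any output should have a similar effect, since the browser then does not parse the body as HTML.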