
PHP: string does not equal string


The Project:

I'm parsing a CSV file and am inserting its data into a database. In a loop I take the CSV values and validate them (make sure they're populated), and the values are stored like this:

$current_order_number = $array[$i]['order_number'] ?? null;

("order_number" is the name of a header inside the CSV file.)

The Problem:

When I manually typed "order_number", $current_order_number was null.

When I copied "order_number" from the CSV file, $current_order_number was correct.

(This is the only CSV header that does this. I manually typed the rest and they work perfectly.)

Debugging:

I used a string-to-ASCII converter and found that the correct string ("order_number") converted into:

NULL 111 114 100 101 114 95 110 117 109 98 101 114

while the incorrect string ("order_number") converted into:

111 114 100 101 114 95 110 117 109 98 101 114
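The same comparison can be made in PHP itself, without an external converter. This is a minimal sketch; `byte_codes` is a hypothetical helper name, not part of the code in the question:

```php
<?php
// Return the decimal value of every byte in $s, separated by spaces.
function byte_codes(string $s): string
{
    // unpack('C*', ...) yields one unsigned 8-bit integer per byte.
    return implode(' ', unpack('C*', $s));
}

echo byte_codes('order_number'), "\n";
// 111 114 100 101 114 95 110 117 109 98 101 114
```

Passing the first header as read by fgetcsv() (for example `$fields[0]`) instead of a typed literal would also list any invisible leading bytes.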

I tried (to remove the null):

$value = str_replace("\0", "", $value);

And (thinking it was the encoding):

$value = utf8_encode($value);

with no luck.

Full Code:

$target_path = $target_path . basename( $_FILES['csv_raw']['name'] );

if ( move_uploaded_file($_FILES['csv_raw']['tmp_name'], $target_path) ) {
  echo "File Upload Successful: -- [" . $target_path . "] --\r\n";
  $array = $fields = array();
  $i = 0;
  $handle = @fopen($target_path, "r");
  if ($handle) {
    while (($row = fgetcsv($handle, 4096)) !== false) {
      if (empty($fields)) {
        $fields = $row;
        continue;
      }
      foreach ($row as $k=>$value) {
        if (strlen(trim($value)) == 0) {
          $value = 0;
        }
        $value = str_replace("\0", "", $value);
        $array[$i][$fields[$k]] = $value;
      }
      //print_r($array[$i]);
      //"order_number"(correct [NULL 111 114 100 101 114 95 110 117 109 98 101 114])
      //"order_number" (incorrect [111 114 100 101 114 95 110 117 109 98 101 114])
      $current_order_number = $array[$i]['order_number'] ?? null;
      $current_order_line = $array[$i]['order_line'] ?? null;
      $current_shipment_number = $array[$i]['shipment_number'] ?? null;
      $current_package_number = $array[$i]['package_number'] ?? null;
      $current_package_item_number = $array[$i]['package_item_num'] ?? null;
      $current_item_code = $array[$i]['item_code'] ?? null;
      $current_quantity = $array[$i]['qty_shipped'] ?? null;
      $query = query("SELECT * FROM `osl` WHERE `order_number` = ".$current_order_number." AND `item_code` = ".$current_item_code." ");
      confirm($query);

      if (mysqli_num_rows($query) == 0) {
        $query = query("SELECT * FROM `osl_daily` WHERE `order_number` = ".$current_order_number." AND `item_code` = ".$current_item_code." ");
        confirm($query);
        if (mysqli_num_rows($query) == 0) {
          $query = query("
          INSERT INTO `osl_daily` (`local_id`, `order_number`, `order_line`, `shipment_number`, `package_number`, `package_item_number`,`item_code`, `quantity`)
          VALUES (NULL,".$current_order_number.",".$current_order_line.",".$current_shipment_number.",".$current_package_number.",".$current_package_item_number.",".$current_item_code.",".$current_quantity.")
          ");
          confirm($query);
          echo "Success: Order Number ".$current_order_number." Lines Have Been Imported Successfully.\r\n";
        }
        else {
          echo "Error: Order Number ".$current_order_number." Line Already Exists in osl_daily.\r\n";
        }
      }
      else {
        echo "Error: Order Number ".$current_order_number." Line Already Exists in osl.\r\n";
      }

      $i++;
    }
    if (!feof($handle)) {
      echo "Error: unexpected fgetcsv() fail\r\n";
    }
    fclose($handle);
  }
}
else {
  echo "Error: File Upload Failed\r\n";
}

CSV Code:

order_number,order_line,shipment_number,package_number,package_item_num,item_code,qty_shipped
43441,1,37294,1,1,10000,1
43441,2,37294,1,2,10010,1

Why doesn't "order_number" = "order_number"? And how do I make order_number = order_number?


Solution

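  • The first three bytes of the file seem to be the problem: 239, 187 and 191 (0xEF 0xBB 0xBF). That sequence is the UTF-8 byte order mark (BOM). fgetcsv() does not strip it, so the first header is read as the BOM followed by "order_number", and the typed key 'order_number' never matches it. This is also why str_replace("\0", "", $value) had no effect: the extra bytes are not null bytes, and the replacement was applied only to the data values, not to the header row.

    The easy way to deal with this is to strip those bytes from the first field if they are present: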

    if (empty($fields)) {
        if ( substr($row[0],0,3) === chr(239).chr(187).chr(191))    {
            $row[0] = substr($row[0], 3);
        }
        $fields = $row;
        continue;
    }
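To see the fix in isolation, here is a self-contained sketch that writes a small BOM-prefixed CSV to a temporary file and reads it back. `read_csv_assoc` is a hypothetical helper name, and the two-column sample data is made up for the demonstration; only the BOM check mirrors the snippet above.

```php
<?php
declare(strict_types=1);

// Read a CSV file into rows keyed by header name, dropping a leading
// UTF-8 BOM from the first header if one is present.
// (read_csv_assoc is a hypothetical name used for this sketch.)
function read_csv_assoc(string $path): array
{
    $bom = "\xEF\xBB\xBF"; // bytes 239 187 191

    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $fields = [];
    $rows = [];
    while (($row = fgetcsv($handle, 4096, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() returns [null] for a blank line
        }
        if ($fields === []) {
            // Header row: remove the BOM before using the names as keys.
            if (strncmp((string) $row[0], $bom, 3) === 0) {
                $row[0] = substr($row[0], 3);
            }
            $fields = $row;
            continue;
        }
        $assoc = [];
        foreach ($row as $k => $value) {
            if (isset($fields[$k])) {
                $assoc[$fields[$k]] = $value;
            }
        }
        $rows[] = $assoc;
    }
    fclose($handle);

    return $rows;
}

// Usage: a file that starts with a BOM, as in the question.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "\xEF\xBB\xBForder_number,item_code\n43441,10000\n");
$rows = read_csv_assoc($path);
unlink($path);

var_dump($rows[0]['order_number'] ?? null); // string(5) "43441"
```

Without the BOM check, the first key would be the BOM followed by "order_number", and the lookup with the typed key would fall through to null, which is the behaviour described in the question.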