As this solution was adapted from a well-voted answer elsewhere, I was not expecting to encounter problems.
Problem: I want to LEFT JOIN
file0.csv with file1.csv.
file0.csv
+-----------------+-------------+--------------+
| Manufacturer ID | image | description |
+-----------------+-------------+--------------+
| SKU231 | image1.jpg | A box. |
| SKUAG1 | image22.jpg | Another box. |
| SKU21D | image7a.png | A third box. |
+-----------------+-------------+--------------+
file1.csv:
+--------+--------+--------+-------+-------+
| mpn | length | height | width | title |
+--------+--------+--------+-------+-------+
| SKU231 | 22 | 14 | 10 | Box 1 |
| SKUAG1 | 12 | 6 | 6 | Box 2 |
| SKU21D | 5 | 8 | 5 | Box 3 |
+--------+--------+--------+-------+-------+
Desired result (file2.csv):
+-----------------+-------------+--------------+--------+--------+--------+-------+-------+
| Manufacturer ID | image | description | mpn | length | height | width | title |
+-----------------+-------------+--------------+--------+--------+--------+-------+-------+
| SKU231 | image1.jpg | A box. | SKU231 | 22 | 14 | 10 | Box 1 |
| SKUAG1 | image22.jpg | Another box. | SKUAG1 | 12 | 6 | 6 | Box 2 |
| SKU21D | image7a.png | A third box. | SKU21D | 5 | 8 | 5 | Box 3 |
+-----------------+-------------+--------------+--------+--------+--------+-------+-------+
PHP function to LEFT JOIN
file0.csv
with file1.csv
on Manufacturer ID
and mpn
respectively:
function my_csv_join(array $csv_input_file_array, $csv_output_file, $html_preview, $left_join_on, $right_join_on = NULL) {
if (count($csv_input_file_array) > 2) {
echo 'This function probably only works for 2 csv files being joined at a time. Reapply iteratively if needed. Exiting now.';
exit;
} else {
for ($x = 0; $x <= 1; $x++) {
//get csv file to array
${'file' . $x} = fopen($csv_input_file_array[$x], "r"); //Dynamic variables using braces: https://stackoverflow.com/a/9257536/9095603
while (!feof(${'file' . $x})) {
${'csv_array' . $x}[] = fgetcsv(${'file' . $x});
}
fclose(${'file' . $x});
${'csv_array' . $x} = array_filter(${'csv_array' . $x}); // gets rid of last empty array in case present
//CREATE HEADERED ARRAY
${'header' . $x} = array_shift(${'csv_array' . $x});
foreach (${'csv_array' . $x} as ${'product_data_array' . $x} {
${'headered_array' . $x}[] = array_combine(${'header' . $x}, ${'product_data_array' . $x});
}
}
// How to simulate the SQL LEFT JOIN operation using PHP arrays? https://stackoverflow.com/a/25837426
//function to simulate the left join
$left = $headered_array0;
$right = $headered_array1;
$final = array();
if (empty($right_join_on)) // if $right_join_on omitted implies $right_join_on has the same value as $left_join_on
$right_join_on = $left_join_on;
foreach ($left AS $k => $v) {
$final[$k] = $v; //basically keeping everything; $left just becomes $final
foreach ($right AS $kk => $vv) {
if ($v[$left_join_on] == $vv[$right_join_on]) {
foreach ($vv AS $key => $val)
$final[$k][$key] = $val;
} else {
foreach ($vv AS $key => $val)
$final[$k][$key] = NULL;
}
}
}
if ($html_preview == 'y') {
echo '<pre>';
var_dump($final);
echo '</pre>';
}
// REINSTATE HEADERS
// var_dump($final[0]);
$indented_header = array(
0 => array_keys($final[0])
);
$re_headered_array = array_merge($indented_header, $final);
// write array to csv file
$file2 = fopen($csv_output_file, "w");
foreach ($re_headered_array as $line) {
fputcsv($file2, $line);
}
fclose($file2);
}
}
Call to function my_csv_join()
:
my_csv_join(array('C:\xampp\htdocs\kalugi\file0.csv','C:\xampp\htdocs\kalugi\file1.csv'), 'C:\xampp\htdocs\kalugi\file2.csv','y','Manufacturer ID','mpn');
Actual result shows not all records matching on the file0."Manufacturer ID" = file1.mpn
are being matched. Thus some expected rows that satisfied the join condition were not joined. We have NULL
in place of their values:
var_dump
of result:
array(3) {
[0]=>
array(8) {
["Manufacturer ID"]=>
string(6) "SKU231"
["image"]=>
string(10) "image1.jpg"
["description"]=>
string(6) "A box."
["mpn"]=>
NULL
["length"]=>
NULL
["height"]=>
NULL
["width"]=>
NULL
["title"]=>
NULL
}
[1]=>
array(8) {
["Manufacturer ID"]=>
string(6) "SKUAG1"
["image"]=>
string(11) "image22.jpg"
["description"]=>
string(12) "Another box."
["mpn"]=>
NULL
["length"]=>
NULL
["height"]=>
NULL
["width"]=>
NULL
["title"]=>
NULL
}
[2]=>
array(8) {
["Manufacturer ID"]=>
string(6) "SKU21D"
["image"]=>
string(11) "image7a.png"
["description"]=>
string(12) "A third box."
["mpn"]=>
string(6) "SKU21D"
["length"]=>
string(1) "5"
["height"]=>
string(1) "8"
["width"]=>
string(1) "5"
["title"]=>
string(5) "Box 3"
}
}
Result as written out to $file2 (file2.csv):
+-----------------+-------------+--------------+--------+--------+--------+-------+-------+ | Manufacturer ID | image | description | mpn | length | height | width | title | +-----------------+-------------+--------------+--------+--------+--------+-------+-------+ | SKU231 | image1.jpg | A box. | | | | | | | SKUAG1 | image22.jpg | Another box. | | | | | | | SKU21D | image7a.png | A third box. | SKU21D | 5 | 8 | 5 | Box 3 | +-----------------+-------------+--------------+--------+--------+--------+-------+-------+
Why are 2 of the rows not joined despite file0."Manufacturer ID" = file1.mpn
being satisfied?
I'm going to simplify this down to an SSCCE so it's easy to understand and reproduce.
I think you're over-complicating the solution here which is probably what's causing you so many weird bugs. The solution is actually rather simple. All you need to do is take each csv file, parse it into rows and columns, and then merge each of those rows into one row to write it back out to another csv.
This is the simplest way to do it:
$file0 = <<<'FILE0'
Manufacturer ID,image,description
SKU231,image1.jpg,A box.
SKUAG1,image22.jpg,Another box.
SKU21D,image7a.png,A third box.
FILE0;
$file1 = <<<'FILE1'
mpn,length,height,width,title
SKU231,22,14,10,Box 1
SKUAG1,12,6,6,Box 2
SKU21D,5,8,5,Box 3
FILE1;
$csv1 = array_map('str_getcsv', explode("\n", $file0));
$csv2 = array_map('str_getcsv', explode("\n", $file1));
$fd = fopen("/tmp/test.csv", 'w');
foreach ($csv1 as $row => $columns) {
$newRow = array_merge($columns, $csv2[$row]);
fputcsv($fd, $newRow);
}
$fd = fopen("/tmp/test.csv", "r");
while (($line = fgets($fd)) !== false) {
echo $line;
}
The result:
"Manufacturer ID",image,description,mpn,length,height,width,title SKU231,image1.jpg,"A box.",SKU231,22,14,10,"Box 1" SKUAG1,image22.jpg,"Another box.",SKUAG1,12,6,6,"Box 2" SKU21D,image7a.png,"A third box.",SKU21D,5,8,5,"Box 3"
|Manufacturer ID | image | description | mpn | length | height | width | title | |----------------|-------------|--------------|--------|--------|--------|-------|-------| | SKU231 | image1.jpg | A box. | SKU231 | 22 | 14 | 10 | Box 1 | | SKUAG1 | image22.jpg | Another box. | SKUAG1 | 12 | 6 | 6 | Box 2 | | SKU21D | image7a.png | A third box. | SKU21D | 5 | 8 | 5 | Box 3 |