At the moment I have a script that will remove the row in my csv when it has already seen the sku before.
Here is the script:
<?php
// array to hold all unique lines
$lines = array();
// array to hold all unique SKU codes
$skus = array();
// index of the `sku` column
$skuIndex = -1;
// open the "save-file"
if (($saveHandle = fopen("unique.csv", "w")) !== false) {
// open the csv file
if (($readHandle = fopen("original.csv", "r")) !== false) {
// read each line into an array
while (($data = fgetcsv($readHandle, 8192, ",")) !== false) {
if ($skuIndex == -1) {
// we need to determine what column the "sku" is; this will identify
// the "unique" rows
foreach ($data as $index => $column) {
if ($column == 'sku') {
$skuIndex = $index;
break;
}
}
if ($skuIndex == -1) {
echo "Couldn't determine the SKU-column.";
die();
}
// write this line to the file
fputcsv($saveHandle, $data);
}
// if the sku has been seen, skip it
if (isset($skus[$data[$skuIndex]])) continue;
$skus[$data[$skuIndex]] = true;
// write this line to the file
fputcsv($saveHandle, $data);
}
fclose($readHandle);
}
fclose($saveHandle);
}
?>
This works fine, but I now need the rows that are being deleted.
What I need now is to modify the code so that it adds the same prefix to every duplicate SKU instead of dropping the row. There will only ever be two rows with the same SKU.
I do not know where to start.
This will add a prefix to any duplicate SKU and then write the row to the unique CSV output, e.g. XYZ123 becomes duplicate-XYZ123.
Change:
if (isset($skus[$data[$skuIndex]])) continue;
to:
if (isset($skus[$data[$skuIndex]])) $data[$skuIndex] = 'duplicate-' . $data[$skuIndex];
Also add continue; after fputcsv($saveHandle, $data); inside the if ($skuIndex == -1) { block. Because fputcsv(...) appears twice in the loop, without the continue both calls run on the first iteration and the header row is written twice.
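Putting both changes together, the loop might look roughly like the sketch below. This is only an illustration under a few assumptions: the function name prefixDuplicateSkus and its parameters are hypothetical (they are not in the original script), the header lookup uses array_search instead of the foreach, and rows that have no SKU column (such as blank lines) are copied through unchanged.

```php
<?php
// Hypothetical helper (not from the original post): copies $inPath to $outPath,
// prefixing the SKU of any row whose SKU has already been seen.
function prefixDuplicateSkus(string $inPath, string $outPath, string $prefix = 'duplicate-'): void
{
    $readHandle = fopen($inPath, 'r');
    if ($readHandle === false) {
        throw new RuntimeException("Could not open $inPath");
    }
    $saveHandle = fopen($outPath, 'w');
    if ($saveHandle === false) {
        fclose($readHandle);
        throw new RuntimeException("Could not open $outPath");
    }

    $skus = array();   // SKU codes seen so far
    $skuIndex = -1;    // index of the `sku` column, -1 until the header is read

    // delimiter, enclosure, and escape are passed explicitly to fgetcsv/fputcsv
    while (($data = fgetcsv($readHandle, 8192, ',', '"', '\\')) !== false) {
        if ($skuIndex == -1) {
            // first row: locate the "sku" column in the header
            $found = array_search('sku', $data, true);
            if ($found === false) {
                fclose($readHandle);
                fclose($saveHandle);
                throw new RuntimeException("Couldn't determine the SKU-column.");
            }
            $skuIndex = $found;
            fputcsv($saveHandle, $data, ',', '"', '\\');
            continue; // the header is written once; skip the duplicate check
        }

        // assumption: rows without a SKU column are copied as they are
        if (!isset($data[$skuIndex])) {
            fputcsv($saveHandle, $data, ',', '"', '\\');
            continue;
        }

        $sku = $data[$skuIndex];
        if (isset($skus[$sku])) {
            // seen before: keep the row, but mark its SKU
            $data[$skuIndex] = $prefix . $sku;
        }
        $skus[$sku] = true;
        fputcsv($saveHandle, $data, ',', '"', '\\');
    }

    fclose($readHandle);
    fclose($saveHandle);
}
```

Calling prefixDuplicateSkus("original.csv", "unique.csv"); would then behave like the modified script. Note that if a SKU ever appeared three times, the second and third rows would both become duplicate-XYZ123, so this relies on there being at most two rows per SKU.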