
Remove duplicate rows from 2d array, but preserve rows with non-empty value in a specific column


I have a 2d array containing rows where a user might be represented more than once. I need to remove duplicate instances of user data, but I don't want to lose any meaningful data in the process. Rows with a non-empty value in the Flying Tour column should be prioritized over a row with an empty value in the same column.

Sample data:

$data = [
    [
        'Access ID' => 12345,
        'Registration Date' => '2018-02-27',
        'First Name' => 'Damian',
        'Last Name' => 'Martin',
        'Flying Tour' => ''
    ],
    [
        'Access ID' => 12345,
        'Registration Date' => '2018-02-27',
        'First Name' => 'Damian',
        'Last Name' => 'Martin',
        'Flying Tour' => 'Yes going'
    ],
    [
        'Access ID' => 789456,
        'Registration Date' => '2018-03-27',
        'First Name' => 'Ricky',
        'Last Name' => 'Smith',
        'Flying Tour' => ''
    ],
    [
        'Access ID' => 789456,
        'Registration Date' => '2018-03-27',
        'First Name' => 'Ricky',
        'Last Name' => 'Smith',
        'Flying Tour' => 'Two way going',
    ],
    [
        'Access ID' => 987654,
        'Registration Date' => '2018-04-27',
        'First Name' => 'Darron',
        'Last Name' => 'Butt',
        'Flying Tour' => ''
    ]
];

My code:

$results = [];
foreach ($data as $input) {
    $isDuplicate = false;
    foreach ($results as $result) {
        if (
            strtolower($input['First Name']) === strtolower($result['First Name']) &&
            strtolower($input['Last Name']) === strtolower($result['Last Name']) &&
            strtolower($input['Registration ID']) === strtolower($result['Registration ID']) &&
            strtolower(!empty($input['Flying Tour']))
        ) {
            // a duplicate was found in results
            $isDuplicate = true;
            break;
        }
    }
    // if no duplicate was found
    if (!$isDuplicate) {
        $results[] = $input;
    }
}

Desired result:

Array
(
    [0] => Array
        (
            [Access ID] => 12345
            [Registration Date] => 2018-02-27
            [First Name] => Damian
            [Last Name] => Martin
            [Flying Tour] => Yes going
        )

    [1] => Array
        (
            [Access ID] => 789456
            [Registration Date] => 2018-03-27
            [First Name] => Ricky
            [Last Name] => Smith
            [Flying Tour] => Two way going
        )

    [2] => Array
        (
            [Access ID] => 987654
            [Registration Date] => 2018-04-27
            [First Name] => Darron
            [Last Name] => Butt
            [Flying Tour] => 
        )

)



Solution

  • Use foreach() and key the result array by a composite of the identifying columns, so that duplicates can be detected with isset():

    $results = [];
    foreach ($data as $input) {
        // Composite key that identifies a user.
        $key = $input['Access ID'] . '_' . $input['First Name'] . '_' . $input['Last Name'];

        // Store the row if the user is new, or if the stored row has an empty 'Flying Tour'.
        if (!isset($results[$key]) || $results[$key]['Flying Tour'] == '') {
            $results[$key] = $input;
        }
    }

    // Re-index the result with sequential numeric keys.
    $results = array_values($results);
    //array_multisort( array_column($results, "First Name"), SORT_ASC, $results );
    echo "<pre>";
    print_r($results);
    

    Sample Output: https://3v4l.org/2KcSN
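
The question's attempt compared names with strtolower(), which suggests matching should be case-insensitive. The keyed approach above could be wrapped in a reusable function along these lines. This is only a sketch: the function name dedupePreferFilled and its parameters $keyColumns and $preferColumn are hypothetical, and it assumes that lowercasing the identifying columns is an acceptable way to compare them.

```php
<?php
/**
 * Remove duplicate rows, preferring a row whose $preferColumn is non-empty.
 * Hypothetical helper, not part of the original answer.
 *
 * @param array[]  $rows         input rows (associative arrays)
 * @param string[] $keyColumns   columns that together identify a user
 * @param string   $preferColumn column whose non-empty value should win
 * @return array[] de-duplicated rows, re-indexed from 0
 */
function dedupePreferFilled(array $rows, array $keyColumns, string $preferColumn): array
{
    $results = [];
    foreach ($rows as $row) {
        // Build a case-insensitive composite key from the identifying columns.
        $parts = [];
        foreach ($keyColumns as $column) {
            $parts[] = strtolower((string) ($row[$column] ?? ''));
        }
        $key = implode('_', $parts);

        // Store the row if the key is new, or if the stored row's
        // preferred column is still empty.
        if (!isset($results[$key]) || ($results[$key][$preferColumn] ?? '') === '') {
            $results[$key] = $row;
        }
    }

    return array_values($results);
}

// Usage with a reduced version of the sample data (note the lowercase "damian").
$rows = [
    ['Access ID' => 12345, 'First Name' => 'Damian', 'Last Name' => 'Martin', 'Flying Tour' => ''],
    ['Access ID' => 12345, 'First Name' => 'damian', 'Last Name' => 'Martin', 'Flying Tour' => 'Yes going'],
    ['Access ID' => 987654, 'First Name' => 'Darron', 'Last Name' => 'Butt', 'Flying Tour' => ''],
];
$unique = dedupePreferFilled($rows, ['Access ID', 'First Name', 'Last Name'], 'Flying Tour');
print_r($unique);
```

Because rows are stored under their key, each input row costs one lookup instead of a scan over all previous results. If two rows for the same user both carry a non-empty value, the first one encountered is kept.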