Below are two arrays from two different feeds. They use different ids, so I have to rely on 'BriefTitle': I can tell from the 'BriefTitle' and other data (e.g. [LocationCountry], [StartDate], [Condition]) that these are the same record. I would like to compare a substring of each 'BriefTitle' against the other 'BriefTitle' values to filter out duplicates, since one title is contained in the other. I am not looking for an exact match, which is what most of the solutions I've found here provide.
I like the short solution proposed by sevavietl / mickmackusa in "php remove duplicates from multidimensional array by value":
$result = array_reverse(array_values(array_column(
array_reverse($data),
null,
'BriefTitle'
)));
However, my 'BriefTitle' is an array (which doesn't seem to work with array_column), and I am not sure how to apply the substr function to the solution above.
Some quick notes:
Any thoughts on how I should approach this? The arrays:
[0] => Array
(
[Rank] => 422
[id] => Array
(
[0] => 152091
)
[Condition] => Array
(
[0] => Depression
[1] => Ketamine
)
[BriefTitle] => Array
(
[0] => Positron Emission Tomography Assessment of Ketamine Binding of the Serotonin Transporter
)
[LocationCountry] => Array
(
[0] => Austria
)
[StartDate] => Array
(
[0] => May 5, 2016
)
[LastUpdatePostDate] => Array
(
[0] => October 15, 2018
)
[Entheogen] => ketamine
[Source] => clinicaltrials.gov
)
[1] => Array
(
[Rank] => 6673
[id] => Array
(
[0] => YSBSZ18291
)
[Condition] => Array
(
[0] => Depressive Disorder
[1] => Ketamine
)
[BriefTitle] => Array
(
[0] => Positron Emission Tomography assessment of Ketamine Binding of the Serotonin Transporter and its Relevance for Rapid Antidepressant Response
[1] => Die Rolle des Serotonintransporters bei der akuten antidepressiven Wirkung von Ketamin, untersucht mit Positronen-Emissions-Tomographie
)
[LocationCountry] => Array
(
[0] => Austria
)
[StartDate] => Array
(
[0] => 2016 05 01
)
[LastUpdatePostDate] => Array
(
[0] => 2018 10 15
)
[Entheogen] => ketamine
[Source] => clinicaltrialsregister.eu
)
Unfortunately, because of the nature of your data (matching strings may be substrings of others, with different case), the only real option is to brute-force this. Loop over the array, storing titles as you go and checking whether the current title contains, or is contained in, any of them (case-insensitively):
$result = array();
$brieftitles = array();   // titles of the records kept so far
foreach ($array as $arr) {
    $foundtitle = false;
    $title = $arr['BriefTitle'][0];
    foreach ($brieftitles as $btitle) {
        // case-insensitive check in both directions: either title may be the longer one
        $foundtitle = (stripos($title, $btitle) !== false) || (stripos($btitle, $title) !== false);
        if ($foundtitle) break;
    }
    if (!$foundtitle) {
        // first occurrence: keep the record and remember its title
        $result[] = $arr;
        $brieftitles[] = $title;
    }
}
print_r($result);
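Note that the loop above only looks at the first entry of each 'BriefTitle' array, while the second sample record carries two titles. If a match on any of the titles should count, the same brute-force idea could be wrapped up roughly as follows. This is only a sketch: the helper names titlesOverlap() and dedupeByBriefTitle() are made up for illustration, the sample data is trimmed down to the two fields used, and treating empty titles as "never matching" is an assumption.

```php
<?php
// Hypothetical helper: true when either string contains the other, ignoring case.
// Assumption: an empty title is never treated as a match.
function titlesOverlap(string $a, string $b): bool
{
    if ($a === '' || $b === '') {
        return false;
    }
    return stripos($a, $b) !== false || stripos($b, $a) !== false;
}

// Hypothetical helper: keeps the first record of each group of overlapping titles,
// comparing every entry of 'BriefTitle' rather than only index 0.
function dedupeByBriefTitle(array $records): array
{
    $result = [];
    $seenTitles = [];   // all titles of the records kept so far
    foreach ($records as $record) {
        $titles = (array) ($record['BriefTitle'] ?? []);
        $isDuplicate = false;
        foreach ($titles as $title) {
            foreach ($seenTitles as $seen) {
                if (titlesOverlap($title, $seen)) {
                    $isDuplicate = true;
                    break 2;   // leave both inner loops on the first match
                }
            }
        }
        if (!$isDuplicate) {
            $result[] = $record;
            foreach ($titles as $title) {
                $seenTitles[] = $title;
            }
        }
    }
    return $result;
}

// Trimmed-down versions of the two records from the question.
$data = [
    [
        'id' => ['152091'],
        'BriefTitle' => ['Positron Emission Tomography Assessment of Ketamine Binding of the Serotonin Transporter'],
    ],
    [
        'id' => ['YSBSZ18291'],
        'BriefTitle' => [
            'Positron Emission Tomography assessment of Ketamine Binding of the Serotonin Transporter and its Relevance for Rapid Antidepressant Response',
            'Die Rolle des Serotonintransporters bei der akuten antidepressiven Wirkung von Ketamin, untersucht mit Positronen-Emissions-Tomographie',
        ],
    ],
];

$unique = dedupeByBriefTitle($data);
echo count($unique), "\n";   // prints 1: the second record is dropped as a duplicate
```

As with the loop above, whichever record comes first wins, so the order of the merged feeds decides which copy survives. The comparison is still quadratic in the number of titles, which should be fine for feeds of a few thousand records.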