I am using Prolog and I am trying to find the intersection or the common elements between two lists and the result should not contain duplicates. In addition, the case of lists with different lengths should be handled. The result of the predicate should be as follows:
?-no_duplicates_intersection([a,v,a,c],[a,a,a,a,a],L).
L = a.
Actually, I found a question or two tackling the same issue, but the answers were way too long. I was wondering if there was a more straightforward and easier method using the following predicate, which returns the intersection between two lists with duplicates:
intersection_with_dulpicates([], [], []).
intersection_with_dulpicates([],M,[]).
intersection_with_dulpicates([X|Y],M,[X|Z]):-
member(X,M),
intersection_with_dulpicates(Y,M,Z).
intersection_with_dulpicates([X|Y],M,Z):-
\+member(X,M),
intersection_with_dulpicates(Y,M,Z).
Taking advantage of the built-in sort (which also removes duplicates):
intersection_without_duplicates(Lst1, Lst2, Intersection) :-
% Sort and remove duplicates from both
% The built-in sort is quick
sort(Lst1, Lst1Sorted),
sort(Lst2, Lst2Sorted),
intersect_sorted(Lst1Sorted, Lst2Sorted, Intersection).
intersect_sorted([], _Lst2Sorted, []).
intersect_sorted([H|T], LstSorted, Intersection) :-
( member_listsorted(H, LstSorted)
-> Intersection = [H|Intersection0]
; Intersection0 = Intersection
),
intersect_sorted(T, LstSorted, Intersection0).
member_listsorted(H, LstSorted) :-
member_listsorted_(LstSorted, H).
member_listsorted_([H|T], Elem) :-
( H @< Elem
-> member_listsorted_(T, Elem)
; H = Elem
).
Sample output in swi-prolog:
?- time(intersection_without_duplicates([a, b, c, d, b, c, d], [b, c, b, c, d],
I)).
% 31 inferences, 0.000 CPU in 0.000 seconds (89% CPU, 586277 Lips)
I = [b,c,d].
?- numlist(1, 10000, Lst1), numlist(5000, 12345, Lst2), time((intersection_without_duplicates(Lst1, Lst2, Intersection))).
% 25,060,003 inferences, 1.313 CPU in 1.297 seconds (101% CPU, 19090034 Lips)
Performance comparison with @TessellatingHeckler's suggestion:
?- numlist(1, 10000, Lst1), numlist(5000, 12345, Lst2), time((intersection(Lst1, Lst2, Both), sort(Both, Answer))).
% 35,001 inferences, 2.193 CPU in 2.167 seconds (101% CPU, 15957 Lips)