Search code examples
prologdeclarative

Splitting a prolog list based on a delimiter from the list?


In Prolog, let's say I have a list such as

[fun, joke1, joke2, fun, joke3, fun, joke4, joke5, joke6]

I'm trying to build a list of lists that will result to

[ [joke1, joke2], [joke4, joke5, joke6] ] 

using fun as the delimiter and ignoring a built list of length 1, hence

[joke3] 

is not inside that list.

I tried using split_string but it doesn't work for what I need to obtain. I've also tried recursion using append but that didn't work out as well. Hoping I can be pointed to the right direction.

Thanks!


Solution

  • Here is a solution which uses two predicates:

    split_on_delimiter(L, D, S) :-
        split_on_delimiter_(L, D, R),
        findall(X, (member(X, R), length(X,Length), Length > 1), S).
    
    split_on_delimiter_([], _, [[]]).    
    split_on_delimiter_([D|T], D, [[]|T2]) :-
        split_on_delimiter_(T, D, T2).
    split_on_delimiter_([H|T], D, [[H|T2]|T3]) :-
        dif(H, D),
        split_on_delimiter_(T, D, [T2|T3]).
    

    split_on_delimiter_/3 is the predicate which really splits the list based on the delimiter D. split_on_delimiter/3 is the predicate that you would call and which will in turn call split_on_delimiter_/3.

    To keep only the sublists of length more than 1, we use findall to find all sublists of the result of the split which have Length > 1.

    split_on_delimiter_/3 itself is fairly simple:

    • First rule: spliting an empty list results in only one sublist: the empty list.
    • Second rule: when the first element of the list is the delimiter, the result is the empty list followed by the result of the recursive call.
    • Third rule: when the first element of the list is not the delimiter (dif(H, D)), then we put that element at the beginning of the first sublist and recursive call.

    An example:

    ?- split_on_delimiter([fun, joke1, joke2, fun, joke3, fun, joke4, joke5, joke6], fun, Z).
    Z = [[joke1, joke2], [joke4, joke5, joke6]] ;
    false.
    

    Note

    split_on_delimiter_/3 has extraneous choice points (which is why you can press ; after the first result, because it thinks there can be more answers but there are none). You could solve this using ! or once.

    A better solution to remove those choice points is using if_/3 and (=)/3 of module(reif) (though I doubt this is useful to you):

    split_on_delimiter_(L, D, [L2|T2]) :-
        if_(L = [],
            [L2|T2] = [[]],
            (   L = [H|T],
                if_(H = D,
                    (   L2 = [],
                        split_on_delimiter_(T, D, T2)
                    ),
                    (   L2 = [H|T3],
                        split_on_delimiter_(T, D, [T3|T2])
                    )
                )
            )
        ).