Search code examples
prolog

Why are so many correct solutions found?


Suppose I have a list of "base atoms" and a longer atom made up only of those base atoms. repeats are allowed. I whipped up the following code to generate lists of all the atoms which could possibly make up the longer atom. This code seems to work with the exception that it finds many repeats of the same (correct) solutions and I am not sure why?

basics([a, abc, def, aaa]).
  
candidate(aaaabcaaadef).

join_atoms(Atoms, Atom):- join_atoms( Atoms, '', Atom ) .
 
join_atoms( [],    Atom,      Atom ) . 
join_atoms( [H|T], AtomAccum, Atom ) :- 
    atom_concat( AtomAccum, H, UpdatedAtom) ,
    join_atoms( T, UpdatedAtom, Atom ) . 

split_atoms( C, Atoms ):- split_atoms( C, [], Atoms ) .

split_atoms( '', AtomsAccum, Atoms ) :-
    candidate(C) ,
    reverse( AtomsAccum, AtomsAccumR ) ,
    join_atoms( AtomsAccumR, C ) ,
    Atoms = AtomsAccumR .
split_atoms(C, AtomsAccum, Atoms):- 
    basics( B ) ,
    member( SubAtom, B ) ,
    sub_atom( C, _, Length, _, SubAtom ) , 
    sub_atom( C, Length, _, 0, AtomRest ) , 
    split_atoms(AtomRest, [SubAtom|AtomsAccum], Atoms).  

main:-
    candidate( C ) ,
    findall( Atoms, split_atoms(C, Atoms), AllAtoms ) ,
    sort( AllAtoms, UniqueAtoms ) , 
    write(UniqueAtoms),
    nl .

The findall/3 and sort/2 will get all solutions and remove the duplicates, of course. But without those the correct solutions are repeated multiple times.

For example (output truncated)

| ?- split_atoms(aaaabcaaadef, Atoms).

Atoms = [a,a,a,abc,a,a,a,def] ? a

Atoms = [a,a,a,abc,a,a,a,def]

Atoms = [a,a,a,abc,a,a,a,def]

Atoms = [a,a,a,abc,a,a,a,def]

Atoms = [a,a,a,abc,a,a,a,def]

Atoms = [a,a,a,abc,a,a,a,def]

Atoms = [a,a,a,abc,aaa,def]

Atoms = [a,a,a,abc,a,a,a,def]

Atoms = [a,a,a,abc,a,a,a,def]

.
.
.

Can anyone suggest why this is happening? Presumably it is backtracking more than necessary for some reason? Or perhaps my code is unintentionally creating a situation which can be minimized?


Solution

  • I think you are over-thinking the problem here.

    To paraphrase your problem statement:

    I want find all the different, distinct ways a given symbol can be composed from an alphabet of shorter symbols.

    To solve that requires a couple of observations:

    • The longer symbol must have as a prefix one of the shorter symbols contained in the alphabet
    • The above recursively holds true if you strip such a prefix off the longer symbol

    That leads to this simple solution: https://swish.swi-prolog.org/p/tLTQiVNQ.pl

    [sub_atom/5 is an ISO-standard built-in predicate that lets you disassemble atoms by offset and length]

    % --------------------------------------
    % compose( Atom, Subatoms, Composition )
    % --------------------------------------
    compose( ''   , _        , []     ) .  % the empty set composes the empty atom
    compose( Atom , Subatoms , [A|As] ) :- % otherwise...
      member(A,Subatoms) ,                 % - fish a symbol out of our alphabet
      sub_atom(Atom,0,L,P,A) ,             % - see if its a prefix of our candidate
      sub_atom(Atom,L,P,_,Nextatom) ,      % - get the next atom (the suffix)
      compose(Nextatom, Subatoms,As)       % - and recurse down
      .                                    % Easy!
    

    Given your sample data:

    ?- compose( aaaabcaaadef, [a,abc,def,aaa], Xs).
    

    The following results are produced on backtracking:

    Xs = [a, a, a, abc, a, a, a, def]
    Xs = [a, a, a, abc, aaa, def]
    Xs = [aaa, abc, a, a, a, def]
    Xs = [aaa, abc, aaa, def]
    false