Tags: machine-learning, prolog, data-mining, swi-prolog, logic-programming

WARMR (ACE suite): eliminate "connected" terms from frequent patterns


I am using the WARMR frequent pattern algorithm in the ACE data mining suite. Here is a toy example illustrating my problem.

Imagine you have, say, 20 examples (example(ex1), ..., example(ex20)) and only one predicate, call it quality, so quality(E, X) means E has quality X. X can take, say, 6 values: a, b, c, d, e and f, which are related: c is b and b is a (and so c is also a); f is e and e is d (and so f is also d). Think graphs:

a - b - c
d - e - f

When WARMR mines for frequent patterns, once one quality in a branch/graph is included, no other quality from the same branch should be allowed to be added. For example, at level 3:

 example(A),quality(A,a),quality(A,d)

is a valid pattern, but:

 example(A),quality(A,a),quality(A,c)

or

 example(A),quality(A,a),quality(A,b)

are not.

I have included this background knowledge in the .bk file:

bond(b,a).
bond(c,b).
bond(f,e).
bond(e,d).

no_bond(a,d).
no_bond(a,e).
no_bond(a,f).

bond(X,Y) :- bond(X,Z),bond(Z,Y).
bond(X,Y) :- bond(Y,X).

no_bond(X,Y) :- no_bond(Y,X).
no_bond(X,Y) :- no_bond(X,Z),bond(Z,Y).
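
A side note on the rules above: under standard Prolog resolution, left-recursive clauses such as bond(X,Y) :- bond(X,Z),bond(Z,Y) and bond(X,Y) :- bond(Y,X) can recurse without end, for example on a query like bond(a,d). Whether that is what ACE runs into here is not established, but a terminating formulation is easy to sketch. The version below assumes the base facts are kept in a separate predicate; the names edge/2, ancestor/2, related/2 and unrelated/2 are hypothetical and not part of ACE.

```prolog
% Hypothetical names: edge/2, ancestor/2, related/2 and unrelated/2
% are illustrative only and are not part of ACE.

% edge(Child, Parent): Child "is a" Parent (the original bond/2 facts).
edge(b, a).
edge(c, b).
edge(f, e).
edge(e, d).

% ancestor(X, A): A is reachable from X by following one or more edges upward.
% Every recursive call first consumes an edge/2 fact, so this terminates
% as long as the edge facts contain no cycle.
ancestor(X, A) :- edge(X, A).
ancestor(X, A) :- edge(X, P), ancestor(P, A).

% related(X, Y): X and Y lie on the same chain, in either direction.
related(X, Y) :- ancestor(X, Y).
related(X, Y) :- ancestor(Y, X).

% unrelated(X, Y): two distinct qualities that are not on the same chain.
% Both arguments must be bound when this is called, because it relies on
% negation as failure.
unrelated(X, Y) :- X \== Y, \+ related(X, Y).
```

With these definitions, related(a, c) and unrelated(a, f) succeed, while related(a, d) fails finitely instead of looping. Keeping the facts (edge/2) apart from the derived relation is what removes the left recursion.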

And I have tried to impose the above condition via the following in the .s file:

rmode(quality(+E, #)).
constraint(quality(E, Q), not_occurs(bond(Q,_))).

and

rmode(quality(+E, #)).
constraint(quality(E, Q), user(X, no_bond(Q,_))).

or

constraint(quality(E, Q), user(X, no_bond(Q,X))).

None of these worked. Any help would be greatly appreciated.


Solution

  • The following answer has been suggested to me:

    First, add the following predicates to the background knowledge:

    branch1(E,X) :- quality(E,X), member(X, [a,b,c]).
    branch2(E,X) :- quality(E,X), member(X, [d,e,f]).
    

    Then include these in the settings file:

    rmode(1:branch1(+E,#)).
    rmode(1:branch2(+E,#)).
    

    This solves my stated problem. However, the data in my actual problem forms a directed tree, so this 'static' branches approach does not apply.
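
For the directed-tree case, one possible reading of the requirement is that two qualities clash when one is an ancestor of the other, while siblings may appear together. The sketch below expresses that check in plain Prolog over a made-up tree; parent/2, on_same_path/2 and compatible/1 are hypothetical names, and how to hook such a check into WARMR's language bias is not covered here.

```prolog
:- use_module(library(lists)).   % member/2

% Hypothetical tree, for illustration only:
%        a
%       / \
%      b   c
%     / \
%    d   e
% parent(Child, Parent)
parent(b, a).
parent(c, a).
parent(d, b).
parent(e, b).

% ancestor(X, A): A lies above X in the tree (one or more steps).
ancestor(X, A) :- parent(X, A).
ancestor(X, A) :- parent(X, P), ancestor(P, A).

% on_same_path(X, Y): X and Y are equal, or one is an ancestor of the other.
on_same_path(X, X).
on_same_path(X, Y) :- ancestor(X, Y).
on_same_path(X, Y) :- ancestor(Y, X).

% compatible(Qs): no two qualities in the list Qs lie on the same
% root-to-leaf path. Qs must be a list of bound atoms.
compatible([]).
compatible([Q|Qs]) :-
    \+ ( member(Other, Qs), on_same_path(Q, Other) ),
    compatible(Qs).
```

Here compatible([d, e, c]) succeeds because d, e and c sit on different paths, whereas compatible([d, b]) fails because b is an ancestor of d. Unlike the fixed branch1/branch2 lists, the relation is derived from the parent/2 facts, so it follows the shape of whatever tree is loaded.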