WARMR algorithm in ALEPH (SWI-Prolog)


I am trying to use WARMR to find frequent relational patterns in my data, and for this I am using ALEPH in SWI-Prolog. However, I am struggling to work out how to do this and why my previous attempts did not work.

I want to get a toy example working before I move on to my full data. For this I took the toy "train" data from the aleph pack page: http://www.swi-prolog.org/pack/list?p=aleph

The Aleph manual says the following about the ar search:

ar Implements a simplified form of the type of association rule search conducted by the WARMR system (see L. Dehaspe, 1998, PhD Thesis, Katholieke Universitaet Leuven). Here, Aleph simply finds all rules that cover at least a pre-specified fraction of the positive examples. This fraction is specified by the parameter pos_fraction.

Accordingly, I have inserted

:- set(search,ar).
:- set(pos_fraction,0.01). 

into the background file (and deleted :- set(i,2).) and erased the .n file of negative examples. I have also commented out all the determinations and the modeh declaration. My reasoning was that we are searching for frequent patterns, not rules: in a supervised context the head would be an "output" variable and the clauses in the body would be "inputs" trying to explain the output, whereas this is an unsupervised task.

Now, the original trains dataset is trying to construct rules for "eastbound" trains. This is done with predicates like car, shape, has_car(train, car) etc. Originally all the background knowledge relating to these is located in the .b file, the five positive examples (e.g. eastbound(east1).) are in the .f file, and the five negative examples (e.g. eastbound(west1).) are in the .n file. Leaving the files unchanged (save for the changes described above) and running induce. does not produce a sensible result (it returns ground terms like train(east1) as a "rule", for example). I have tried moving some of the background knowledge to the .f file, but that did not produce anything sensible either.

How do I go about constructing the .f and .b files? What should go into the positive examples file if we are not really looking to explain any positive examples (which would surely constitute a supervised problem) but instead to find frequent patterns in the data (an unsupervised problem)? Am I missing something?

Any help would be greatly appreciated.


Solution

  • First of all if you can use the original WARMR I think it is better. But I think you need to be an academic for free use. You can try asking for a license. https://dtai.cs.kuleuven.be/ACE/

    To get association rules, I put all the examples I want in the f file. The n file can have examples in it or I think be empty.

    The only thing I change is to put the following in the .b file:

     :- set(search,ar).
     :- set(pos_fraction,0.01).


    Keep the determinations and mode declarations.

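    The set(i,2) setting bounds the number of layers of new variables that may be introduced (2 is also Aleph's default), while the separate clauselength parameter (default 4) bounds the number of literals in a clause. If you want longer queries you may need to raise one or both.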
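    As a rough sketch of what the relevant part of the .b file might look like after these changes (not a complete file: the background facts are omitted, only a few of the mode declarations and determinations from the trains example are shown, and the values chosen for i and clauselength are assumptions to be tuned):

```prolog
% train.b (excerpt). Sketch only; background facts omitted.

% Search settings described in the answer.
:- set(search, ar).          % association-rule style search
:- set(pos_fraction, 0.01).  % minimum fraction of positive examples a clause must cover

% Assumed values: raise these if the queries found are too short.
:- set(i, 3).                % layers of new variables
:- set(clauselength, 5).     % maximum number of literals in a clause

% Mode declarations and determinations are kept, not commented out.
:- modeh(1, eastbound(+train)).
:- modeb(*, has_car(+train, -car)).
:- modeb(1, long(+car)).
:- modeb(1, shape(+car, #shape)).

:- determination(eastbound/1, has_car/2).
:- determination(eastbound/1, long/1).
:- determination(eastbound/1, shape/2).
```

    The set/2 directive names follow the ones used in the question; other Aleph distributions may spell some directives differently, so check the version you have installed.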

    ?- read_all(train).
    ?- induce.

    You will then get an output of 'good clauses', which I think are the frequent queries.

    [good clauses]
    eastbound(A). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), long(B). [pos cover = 2 neg cover = 0] [pos-neg] [2]
    eastbound(A) :- has_car(A,B), open_car(B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), shape(B,rectangle). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), wheels(B,2). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), load(B,rectangle,3). [pos cover = 1 neg cover = 0] [pos-neg] [1]
    eastbound(A) :- has_car(A,B), has_car(A,C). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), has_car(A,C). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), has_car(A,C). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), short(B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), closed(B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), shape(B,rectangle). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), wheels(B,2). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), load(B,triangle,1). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), has_car(A,C). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), has_car(A,C). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), long(B). [pos cover = 2 neg cover = 0] [pos-neg] [2]
    eastbound(A) :- has_car(A,B), open_car(B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), shape(B,rectangle). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), wheels(B,3). [pos cover = 3 neg cover = 0] [pos-neg] [3]
    eastbound(A) :- has_car(A,B), load(B,hexagon,1). [pos cover = 1 neg cover = 0] [pos-neg] [1]
    eastbound(A) :- has_car(A,B), has_car(A,C). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), short(B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), open_car(B). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), shape(B,rectangle). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), wheels(B,2). [pos cover = 5 neg cover = 0] [pos-neg] [5]
    eastbound(A) :- has_car(A,B), load(B,circle,1). [pos cover = 3 neg cover = 0] [pos-neg] [3]
    eastbound(A) :- has_car(A,B), open_car(B), shape(B,rectangle). [pos cover = 4 neg cover = 0] [pos-neg] [4]

    etc etc

    The rules are of the form eastbound(A) :- blah blah, but the search is only counting how many of the eastbound examples in the .f file the body covers. So think of each rule as example_covered(A) :- blah blah.
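
    To make the "pos cover" reading concrete, here is a minimal plain-Prolog sketch of counting how many examples a query body covers. It does not use Aleph and is not how Aleph computes coverage internally; the facts are made-up toy data, and example/1, pattern/1, pos_cover/1 and support/1 are hypothetical names introduced only for this illustration.

```prolog
:- use_module(library(aggregate)).

% Hypothetical toy data (not the real trains dataset).
example(t1).
example(t2).
example(t3).

has_car(t1, c1).
has_car(t2, c2).
has_car(t2, c3).
has_car(t3, c4).

long(c1).
long(c3).

% pattern(+Example): the body of the query being scored,
% here "the example has at least one long car".
pattern(T) :-
    has_car(T, C),
    long(C).

% pos_cover(-N): number of examples for which the pattern has at least
% one proof. The double negation counts each example once, no matter
% how many of its cars match.
pos_cover(N) :-
    aggregate_all(count, (example(T), \+ \+ pattern(T)), N).

% support(-F): covered fraction of the examples, the quantity that a
% threshold such as pos_fraction would be compared against.
support(F) :-
    aggregate_all(count, example(_), Total),
    Total > 0,
    pos_cover(N),
    F is N / Total.
```

    With these facts, ?- pos_cover(N). gives N = 2, since t1 and t2 each have a long car and t3 does not. A query is "frequent" when its support reaches the chosen threshold.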