
Weka Apriori Algorithm convert dataset


How can I use this dataset with the Apriori algorithm in Weka?

'A, C, D',
'B, C, E',
'A, B, C, E',
'B, E'

Solution

  • You need to convert it to .arff format.

    An .arff file has a simple format made up of three sections:

    @relation
    
    @attribute
    
    @data
    

    In a case like this, where each record is just a list of items (the letters, in your case), declare every possible item (A, B, C, ...) as its own attribute in the @attribute section. Then, in the @data section, write one row per record, with t where the item is present and ? (a missing value) where it is absent.

    Example:

    @relation <file_name>
    
    @attribute 'A' { t}
    @attribute 'B' { t}
    @attribute 'C' { t}
    @attribute 'D' { t}
    @attribute 'E' { t}
    
    @data
    t, ?, t, t, ?
    ?, t, t, ?, t
    t, t, t, ?, t
    ?, t, ?, ?, t
    

    For another example, see "supermarket.arff" in the Weka data folder.
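
If there are too many records to convert by hand, the conversion can be scripted. The following is a minimal Python sketch, not part of Weka: the function name `transactions_to_arff` and the relation name `transactions` are made up for illustration, and it assumes each record is a comma-separated string of item names that contain no quotes or commas.

```python
def transactions_to_arff(transactions, relation="transactions"):
    """Build ARFF text with one single-valued {t} attribute per item.

    Hypothetical helper for illustration; assumes item names contain
    no quotes or commas.
    """
    # Split each comma-separated record into a set of item names.
    rows = [
        {item.strip() for item in line.split(",") if item.strip()}
        for line in transactions
    ]
    # Every distinct item becomes one attribute, in sorted order.
    items = sorted(set().union(*rows)) if rows else []

    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute '{item}' {{t}}" for item in items]
    lines += ["", "@data"]
    for row in rows:
        # 't' marks a present item, '?' (missing value) an absent one.
        lines.append(", ".join("t" if item in row else "?" for item in items))
    return "\n".join(lines) + "\n"


transactions = ["A, C, D", "B, C, E", "A, B, C, E", "B, E"]
arff_text = transactions_to_arff(transactions)
print(arff_text)
```

Save the printed text to a file with the .arff extension and open it in Weka. For the four records in the question, the @data rows match the hand-written example above.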