java, machine-learning, weka

Relational data in Weka?


My data consists of tuples of (amount of data, processing time), one tuple per function. I want to classify the instances by the "Class" attribute.

Here is a sample:

Amount-F1 Time-F1 Amount-F2 Time-F2 [...] Class
50         10      20        10            1
20         2       100       20            3
...

How should I build the ARFF file? Should I use a relational attribute for the (Amount, Time) tuples, or should I use "regular" attributes?

Could you show me a sample ARFF file for my example, please?

Thank you


Solution

  • WEKA can also load CSV files directly. If you want ARFF instead, you can convert a CSV file with WEKA's own converters: weka.core.converters.CSVLoader to read it and weka.core.converters.ArffSaver to write it.

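    You could split each tuple into two separate numeric attributes (the data rows below list only F1 and F2, as in your sample; a real file needs one value per declared attribute):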

    @RELATION yourTable
    
    @ATTRIBUTE Amount-F1 NUMERIC
    @ATTRIBUTE Time-F1 NUMERIC
    @ATTRIBUTE Amount-F2 NUMERIC
    @ATTRIBUTE Time-F2 NUMERIC
    @ATTRIBUTE Amount-F3 NUMERIC
    @ATTRIBUTE Time-F3 NUMERIC
    ...
    @ATTRIBUTE Class {1,2,3} % your class labels
    
    @DATA
    50, 10, 20, 10, 1
    20, 2, 100, 20, 3
    ...
    

    Or aggregate each (Amount, Time) pair into a single value, for example the ratio Amount/Time.

    Amount-F1 and Time-F1 then become F1:

    @RELATION yourTable
    
    @ATTRIBUTE F1 NUMERIC
    @ATTRIBUTE F2 NUMERIC
    @ATTRIBUTE F3 NUMERIC
    ...
    @ATTRIBUTE Class {1,2,3} % your class labels
    
    @DATA
    5, 2, 1    % 50/10, 20/10, 1
    10, 5, 3   % 20/2, 100/20, 3
    ...
    

    Which layout works better depends on the use case, but I would generally prefer the second option.
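
Both layouts above can be generated from the raw whitespace-separated table. The following is only a minimal sketch using plain Java, not a WEKA API: the class `ArffLayoutSketch`, its methods `header` and `dataRow`, and the `aggregate` flag are hypothetical names introduced for illustration. It assumes every row holds one (Amount, Time) pair per function followed by the class label, and that the aggregation is the ratio Amount/Time, as in the comments of the second sample.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of WEKA): builds ARFF text for both layouts.
class ArffLayoutSketch {

    // Builds the ARFF header. With aggregate == false each function gets two
    // numeric attributes (Amount-Fi, Time-Fi); with aggregate == true it gets one (Fi).
    static String header(String relation, int functions, List<String> labels, boolean aggregate) {
        StringBuilder sb = new StringBuilder();
        sb.append("@RELATION ").append(relation).append("\n\n");
        for (int i = 1; i <= functions; i++) {
            if (aggregate) {
                sb.append("@ATTRIBUTE F").append(i).append(" NUMERIC\n");
            } else {
                sb.append("@ATTRIBUTE Amount-F").append(i).append(" NUMERIC\n");
                sb.append("@ATTRIBUTE Time-F").append(i).append(" NUMERIC\n");
            }
        }
        sb.append("@ATTRIBUTE Class {").append(String.join(",", labels)).append("}\n\n");
        return sb.toString();
    }

    // Converts one table row "amount1 time1 amount2 time2 ... class" into an ARFF data row.
    static String dataRow(String line, int functions, boolean aggregate) {
        String[] cells = line.trim().split("\\s+");
        if (cells.length != 2 * functions + 1) {
            throw new IllegalArgumentException(
                "expected " + (2 * functions + 1) + " values, got " + cells.length);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < functions; i++) {
            // Parsing also validates that both cells are numeric.
            double amount = Double.parseDouble(cells[2 * i]);
            double time = Double.parseDouble(cells[2 * i + 1]);
            if (aggregate) {
                // Assumption: a zero time is treated as invalid input.
                if (time == 0.0) {
                    throw new IllegalArgumentException("time must be non-zero");
                }
                sb.append(amount / time).append(",");
            } else {
                sb.append(cells[2 * i]).append(",").append(cells[2 * i + 1]).append(",");
            }
        }
        // The last cell is the class label and is copied unchanged.
        return sb.append(cells[cells.length - 1]).toString();
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("50 10 20 10 1", "20 2 100 20 3");
        List<String> labels = Arrays.asList("1", "2", "3");
        // Print the split layout first, then the aggregated one.
        for (boolean aggregate : new boolean[] {false, true}) {
            System.out.print(header("yourTable", 2, labels, aggregate));
            System.out.println("@DATA");
            for (String row : rows) {
                System.out.println(dataRow(row, 2, aggregate));
            }
            System.out.println();
        }
    }
}
```

The ratio halves the number of attributes but discards the absolute magnitudes of Amount and Time, so whether it suits your data is worth checking by training on both layouts.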