
appending two files in Weka


I have two datasets, stored as two .arff files.

Fold1.arff contains:

@relation iris

@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}

@data
5.1,3.5,1.4,0.2,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5,3.5,1.3,0.3,Iris-setosa
7,3.2,4.7,1.4,Iris-versicolor
5,2,3.5,1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
5.5,2.4,3.8,1.1,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
6.3,3.3,6,2.5,Iris-virginica
6.5,3.2,5.1,2,Iris-virginica
6.9,3.2,5.7,2.3,Iris-virginica
7.4,2.8,6.1,1.9,Iris-virginica
6.7,3.1,5.6,2.4,Iris-virginica

Fold2.arff contains:

@relation iris

@attribute sepallength numeric
@attribute sepalwidth numeric
@attribute petallength numeric
@attribute petalwidth numeric
@attribute class {Iris-setosa,Iris-versicolor,Iris-virginica}

@data
4.9,3,1.4,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
4.5,2.3,1.3,0.3,Iris-setosa
6.4,3.2,4.5,1.5,Iris-versicolor
5.9,3,4.2,1.5,Iris-versicolor
6.1,2.8,4,1.3,Iris-versicolor
5.5,2.4,3.7,1,Iris-versicolor
6.1,3,4.6,1.4,Iris-versicolor
5.8,2.7,5.1,1.9,Iris-virginica
6.4,2.7,5.3,1.9,Iris-virginica
5.6,2.8,4.9,2,Iris-virginica
7.9,3.8,6.4,2,Iris-virginica
6.9,3.1,5.1,2.3,Iris-virginica

I then tried to append them using the command:

java weka.core.Instances append d:\fold1.arff d:\fold2.arff > d:\result.arff

I ran the command from the Weka SimpleCLI.

I got this error:

Usage:
weka.core.Instances help
    Prints this help
weka.core.Instances <filename>
    Outputs dataset statistics
weka.core.Instances merge <filename1> <filename2>
    Merges the datasets (must have same number of rows).
    Generated dataset gets output on stdout.
weka.core.Instances append <filename1> <filename2>
    Appends the second dataset to the first (must have same number of attributes).
    Generated dataset gets output on stdout.
weka.core.Instances headers <filename1> <filename2>
    Compares the structure of the two datasets and outputs whether they
    differ or not.
weka.core.Instances randomize <seed> <filename>
    Randomizes the dataset and outputs it on stdout.

Both files have the same number of rows, as you can see from the listings above. Why was the result.arff file not created?


Solution

  • Your two files declare the same attributes (5 in total), so they meet the requirement for append, which is the same number of attributes rather than the same number of rows.

    According to the official documentation, the syntax for append is:

        weka.core.Instances append <filename1> <filename2>
    

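    This appends the rows of filename2 to those of filename1. As the usage text says, the generated dataset is written to stdout; filename1 itself is not changed, so redirect the output to a file if you want to keep the result.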

    Note: It is best to pass file names in quotes (single or double), especially when a file name contains a space (e.g., fold 1.arff instead of fold1.arff):

          java weka.core.Instances append "d:\fold1.arff" "d:\fold2.arff"
    

    In some environments the backslash is an escape character, so it has to be doubled:

          java weka.core.Instances append "d:\\fold1.arff" "d:\\fold2.arff"
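
If the command line keeps getting in the way, the effect of append can also be sketched without Weka, using only the Java standard library. The class below is a hypothetical helper: ArffAppendSketch and its method names are made up for this illustration and are not part of the Weka API. It assumes dense ARFF files with a single @data line and does not handle comments or other ARFF features. Its header check (identical @attribute lines) is an assumption and may be stricter than Weka's own compatibility check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: appends the data rows of a second
// ARFF file to those of a first one and returns the combined lines.
class ArffAppendSketch {

    // Index of the "@data" line, or -1 if there is none.
    static int dataLineIndex(List<String> lines) {
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).trim().equalsIgnoreCase("@data")) {
                return i;
            }
        }
        return -1;
    }

    // The "@attribute" declarations found in a header, trimmed, in order.
    static List<String> attributes(List<String> header) {
        List<String> result = new ArrayList<>();
        for (String line : header) {
            String trimmed = line.trim();
            if (trimmed.toLowerCase().startsWith("@attribute")) {
                result.add(trimmed);
            }
        }
        return result;
    }

    // Copies every non-blank row into the output list.
    static void addRows(List<String> out, List<String> rows) {
        for (String row : rows) {
            if (!row.trim().isEmpty()) {
                out.add(row);
            }
        }
    }

    // Header of the first file, then the rows of the first, then the rows of the second.
    static List<String> append(List<String> first, List<String> second) {
        int d1 = dataLineIndex(first);
        int d2 = dataLineIndex(second);
        if (d1 < 0 || d2 < 0) {
            throw new IllegalArgumentException("both inputs need an @data line");
        }
        // Assumption: the files are treated as compatible only if their
        // @attribute lines are identical.
        if (!attributes(first.subList(0, d1)).equals(attributes(second.subList(0, d2)))) {
            throw new IllegalArgumentException("attribute declarations differ");
        }
        List<String> out = new ArrayList<>(first.subList(0, d1 + 1));
        addRows(out, first.subList(d1 + 1, first.size()));
        addRows(out, second.subList(d2 + 1, second.size()));
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ArffAppendSketch <file1.arff> <file2.arff>");
            return;
        }
        List<String> first = Files.readAllLines(Paths.get(args[0]));
        List<String> second = Files.readAllLines(Paths.get(args[1]));
        // Like the Weka command, this writes the result to stdout.
        for (String line : append(first, second)) {
            System.out.println(line);
        }
    }
}
```

Run with the two file paths as arguments, this prints the combined dataset to stdout and leaves both input files untouched, so the output still has to be redirected to a file to keep it.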