@attribute CustomerID NUMERIC
@attribute Age {A,B,C,D,E,F,G,H,I,J,K}
@attribute Region {A,B,C,D,E,F,G,H}
@attribute ProductSubClass NUMERIC
@attribute ProductID NUMERIC
@attribute Quantity NUMERIC
@attribute Cost NUMERIC
@attribute sales NUMERIC
@data
00141833,F,F,130207,4710105011011,2,44,52
01376753,E,E,110217,4710265849066,1,150,129
01603071,E,G,100201,4712019100607,1,35,39
01738667,E,F,530105,4710168702901,1,94,119
Above is the header and a portion of my training dataset, training.arff. I want to use K-means clustering and the J48 classifier, and I can do that without any problems. The following is my test dataset, test.arff:
@attribute CustomerID NUMERIC
@attribute Age {A,B,C,D,E,F,G,H,I,J,K}
@attribute Region {A,B,C,D,E,F,G,H}
@attribute ProductSubClass NUMERIC
@attribute ProductID INTEGER
@attribute Quantity NUMERIC
@attribute Cost NUMERIC
@attribute sales NUMERIC
@data
1754698,H,A,560402,?,1,676,849
1027365,F,C,530404,?,1,170,219
956710,E,E,500303,?,1,36,59
In both cases I ensured that ProductID is selected as the class.
Here are the steps I followed:
Step 1: Apply "AddCluster" to run K-means clustering and assign a cluster to each instance in the dataset.
Step 2: Use the J48 classification algorithm to evaluate the performance of the clustering algorithm, with the 10-fold cross-validation option.
Step 3: Save the finalized model and close Weka (I close it to test whether I can reload the model and use it again).
Step 4: Load the model in Weka (using "Load model").
Step 5: This time, select "Supplied test set" and choose the test file to predict on (it has the same format as the test.arff shown above).
Step 6: Try "Re-evaluate model on current test set".
But here I get the notification "Data used to train model and test set are not compatible. Would you like to automatically wrap the classifier in an "InputMappedClassifier" before proceeding?" If I click "No", it shows "Train and test set are not compatible ... 5 != 6", and if I click "Yes", it gives the following output in plain text:
=== Predictions on user test set ===
inst# actual predicted error prediction
1 ? 0 ?
2 ? 0 ?
3 ? 0 ?
4 ? 0 ?
5 ? 0 ?
6 ? 0 ?
7 ? 0 ?
8 ? 0 ?
9 ? 0 ?
10 ? 0 ?
11 ? 0 ?
12 ? 0 ?
13 ? 0 ?
14 ? 0 ?
15 ? 0 ?
16 ? 1 ?
17 ? 0 ?
18 ? 0 ?
19 ? 0 ?
20 ? 0 ?
21 ? 0 ?
Now: 1. Is it possible to use the numeric field ProductID as the class? I have to predict the customer's choice of product (ProductID), taking the other attributes into consideration.
NOTE: I am using Weka 3.8.1 GUI
Possibly your test dataset is missing the cluster ID that the K-means clustering operation might have added to the training set (did you tell Weka to do so?) but did not add to the test dataset.
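One way to check for a missing cluster attribute outside the GUI is to compare the two ARFF headers directly. The following is only a rough sketch in plain Python (no Weka involved). It assumes that AddCluster appended a nominal attribute named `cluster` with values `cluster1,cluster2` to the training data; that name, the values, and the helper functions `arff_attributes` and `header_differences` are illustrative assumptions, not something taken from the question. It also does not try to reproduce the exact "5 != 6" message, only the kind of mismatch that can cause it.

```python
import re

# Header of the test file as posted in the question.
TEST_HEADER = """\
@attribute CustomerID NUMERIC
@attribute Age {A,B,C,D,E,F,G,H,I,J,K}
@attribute Region {A,B,C,D,E,F,G,H}
@attribute ProductSubClass NUMERIC
@attribute ProductID INTEGER
@attribute Quantity NUMERIC
@attribute Cost NUMERIC
@attribute sales NUMERIC
"""

# Training header as posted, plus an ASSUMED cluster attribute
# (hypothetical name and values) standing in for what AddCluster may add.
TRAIN_HEADER = TEST_HEADER.replace("ProductID INTEGER", "ProductID NUMERIC") + (
    "@attribute cluster {cluster1,cluster2}\n"
)

def arff_attributes(header_text):
    """Return (name, type) pairs for every @attribute line in an ARFF header."""
    attrs = []
    for line in header_text.splitlines():
        m = re.match(r"\s*@attribute\s+(\S+)\s+(.+?)\s*$", line, flags=re.IGNORECASE)
        if m:
            name, typ = m.group(1), m.group(2)
            # ARFF treats INTEGER and REAL as NUMERIC, so normalise them.
            if typ.upper() in ("INTEGER", "REAL", "NUMERIC"):
                typ = "NUMERIC"
            attrs.append((name, typ))
    return attrs

def header_differences(train_header, test_header):
    """List human-readable differences between two ARFF headers."""
    train = arff_attributes(train_header)
    test = arff_attributes(test_header)
    diffs = []
    if len(train) != len(test):
        diffs.append(f"attribute count: {len(train)} != {len(test)}")
    test_types = dict(test)
    for name, typ in train:
        if name not in test_types:
            diffs.append(f"missing in test: {name}")
        elif test_types[name] != typ:
            diffs.append(f"type differs for {name}: {typ} vs {test_types[name]}")
    return diffs

diffs = header_differences(TRAIN_HEADER, TEST_HEADER)
for d in diffs:
    print(d)
```

Under these assumptions the only differences reported are the attribute count and the missing `cluster` attribute; the NUMERIC versus INTEGER declaration of ProductID is not flagged, because both are numeric types in ARFF.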
Aside from that, the whole point of K-Means is to use it for clustering and not for classification.
So frankly, you are applying things incorrectly, not giving us readers enough information (J48?), and asking (at least) two questions here.