machine-learning weka decision-tree arff j48

WEKA using class values to solve decision tree?


I am brand new to WEKA and ML, so please excuse my ignorance with the following. I've spent several hours trying to figure it out, so hopefully someone can point me in the right direction:

I am trying to run a J48 decision tree on data for USDJPY. The data was loaded from a .csv file, and the class value is nominal: TRUE or FALSE, depending on whether USDJPY was trading more than 1% higher after 20 sessions. The problem is that when I run the algorithm, the decision tree simply uses the class value to solve the problem, which is useless. There are 22 attributes other than the class attribute from which I am looking to predict the class attribute.

When comparing my dataset to the example "glass" dataset, I cannot find any difference between the two that would explain my problem. "glass.arff" works as expected when I run J48 (with identical settings): it predicts the class value (type of glass) from the other attributes, i.e. it gets some guesses wrong.

What am I missing here? Here is a list of the attributes:

@ATTRIBUTE date NUMERIC
@ATTRIBUTE open NUMERIC
@ATTRIBUTE high NUMERIC
@ATTRIBUTE low NUMERIC
@ATTRIBUTE close NUMERIC
@ATTRIBUTE 1daypctchg NUMERIC
@ATTRIBUTE smavg50onclose NUMERIC
@ATTRIBUTE smavg100onclose NUMERIC
@ATTRIBUTE smavg200onclose NUMERIC
@ATTRIBUTE ubb2 NUMERIC
@ATTRIBUTE bollma2 onclose NUMERIC
@ATTRIBUTE lbb2 NUMERIC
@ATTRIBUTE bollwjpybgn NUMERIC
@ATTRIBUTE %bjpybgn NUMERIC
@ATTRIBUTE rsi NUMERIC
@ATTRIBUTE ma50>100 {FALSE,TRUE}
@ATTRIBUTE ma50>200 {FALSE,TRUE}
@ATTRIBUTE ma100>200 {FALSE,TRUE}
@ATTRIBUTE up1pct5d? {FALSE,TRUE}
@ATTRIBUTE up1pct20d? {FALSE,TRUE}
@ATTRIBUTE dwn1pct5d? {FALSE,TRUE}
@ATTRIBUTE dwn1pct20d? {FALSE,TRUE}

Solution

  • Weka (and its J48 implementation) should be able to classify your data as long as the ground-truth class is consistently in the same column of your .csv file.
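
One way to keep the class in a fixed, predictable column is to move it to the end of the .csv before loading it, since the Weka Explorer treats the last attribute as the class by default. The snippet below is only a minimal sketch using the Python standard library. It assumes that `up1pct20d?` is the intended class column (the question does not say so explicitly); `move_class_last` is a hypothetical helper name, and the sample values are made up for illustration.

```python
import csv
import io


def move_class_last(rows, class_name):
    """Return a copy of rows (header first) with the class column moved to the end."""
    header = rows[0]
    idx = header.index(class_name)  # raises ValueError if the column is missing
    # Keep every other column in its original order, then append the class column.
    order = [i for i in range(len(header)) if i != idx] + [idx]
    return [[row[i] for i in order] for row in rows]


# Tiny made-up sample: the column names come from the question, the values do not.
sample = io.StringIO(
    "close,up1pct20d?,rsi\n"
    "101.2,TRUE,55.1\n"
    "100.7,FALSE,48.3\n"
)
reordered = move_class_last(list(csv.reader(sample)), "up1pct20d?")
for row in reordered:
    print(",".join(row))
```

If the class is not in the last column, another option is to pick it explicitly from the class drop-down on the Explorer's Classify tab instead of reordering the file. It is also worth checking whether any of the remaining attributes are derived from the same future price as the class, because the tree can then recover the class almost directly from them.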