java neural-network xor encog

How to provide strings as inputs and outputs in encog XOR function?


I need to write an Encog program in Java, based on the XOR example, that takes string words with their definitions as inputs, but BasicMLDataSet only accepts doubles. Here is the sample code that I am using:

/**
 * The input necessary for XOR.
 */
public static double XOR_INPUT[][] = { { 0.0, 0.0 }, { 1.0, 0.0 },
        { 0.0, 1.0 }, { 1.0, 1.0 } };

/**
 * The ideal data necessary for XOR.
 */
public static double XOR_IDEAL[][] = { { 0.0 }, { 1.0 }, { 1.0 }, { 0.0 } };

And here is the class that receives XOR_INPUT and XOR_IDEAL:

MLDataSet trainingSet = new BasicMLDataSet(XOR_INPUT, XOR_IDEAL);

The code is from the Encog XOR example.

Is there any way to accomplish training with strings, or to parse them into numbers somehow and then convert them back to strings before writing them to the console?


Solution

  • I have found a workaround for this. Since I can only provide double values between 0 and 1 as inputs, and I haven't found any function in Encog that can directly normalize strings to double values, I wrote my own function. I take the ASCII value of every letter in the word and then simply compute 90/asciiValue to get a value between 0 and 1. Keep in mind that this only works for lowercase letters: their ASCII codes run from 97 to 122, so the result falls between roughly 0.74 and 0.93, whereas most uppercase letters have codes below 90 and would produce values above 1. The function can easily be extended to support uppercase letters as well. Here is the function:

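        import java.nio.charset.StandardCharsets;

        // Converts every letter in the string to its ASCII code and normalizes it (90 / asciiValue).
        // najveci ("largest") is the length of the returned array, presumably the longest
        // word's length; unused trailing positions stay 0.0.
        public static double[] toAscii(String s, int najveci) {
                double[] ascii = new double[najveci];
                // StandardCharsets.US_ASCII avoids the checked UnsupportedEncodingException.
                byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
                // Stop at najveci so a word longer than the array cannot overflow it.
                int length = Math.min(bytes.length, najveci);
                for (int i = 0; i < length; i++) {
                        ascii[i] = 90.0 / bytes[i];
                }
                return ascii;
        }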
    

    For the ideal output of each word I'm using a similar solution. I also normalize each letter in the word, but then I take the average of those values. Later, I denormalize those values to get strings back and check how well the model was trained.

    You can view full code here.
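
The encoding, averaging and denormalizing steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the linked full code: the class name WordEncodingSketch and the methods encode, average and decode are hypothetical, and 0.0 is assumed to mark padding. Note that an average collapses a whole word into one number (for example, "ab" and "ba" give the same value), so this sketch only decodes per-letter values back into a string.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; all names here are illustrative assumptions.
public class WordEncodingSketch {
    // Same constant as in the answer: each letter becomes 90 / asciiValue.
    static final double SCALE = 90.0;

    // Encodes a word as one normalized value per letter, zero-padded to maxLength.
    public static double[] encode(String word, int maxLength) {
        double[] out = new double[maxLength];
        byte[] bytes = word.getBytes(StandardCharsets.US_ASCII);
        int n = Math.min(bytes.length, maxLength);
        for (int i = 0; i < n; i++) {
            out[i] = SCALE / bytes[i];
        }
        return out;
    }

    // Averages the normalized letters into a single value (used as the ideal output).
    public static double average(String word) {
        byte[] bytes = word.getBytes(StandardCharsets.US_ASCII);
        if (bytes.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (byte b : bytes) {
            sum += SCALE / b;
        }
        return sum / bytes.length;
    }

    // Denormalizes per-letter values back to characters; stops at the first padding value.
    public static String decode(double[] values) {
        StringBuilder sb = new StringBuilder();
        for (double v : values) {
            if (v <= 0.0) {
                break; // assumed padding marker
            }
            sb.append((char) Math.round(SCALE / v));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[] encoded = encode("cat", 5);
        System.out.println(decode(encoded)); // prints "cat"
        System.out.println(average("cat"));
    }
}
```

Arrays produced by encode could then be collected into a double[][] and passed to BasicMLDataSet in place of XOR_INPUT. Rounding in decode is what makes the round trip tolerant of small errors, which matters once the values come from a network's output rather than directly from encode.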