Tags: csv, machine-learning, conv-neural-network, deeplearning4j, dl4j

Parsing modified MNIST in CSV form for a convolutional neural network


I'm planning to use this modified version of MNIST for benchmarking research, but it is currently in .mat format. I've read on StackOverflow that MatlabRecordReader isn't very robust and that it's far smarter to convert the data to CSV. I downloaded MATLAB and converted the .mat file to a .csv file with 60000 lines (for the test data). The first 784 values of each line are the pixel values of the image, and the last 10 values are the label (though I believe I can easily condense the label into a single value placed after the first 784 values).
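The label condensing mentioned above could be sketched roughly as follows, using only the Java standard library. This is a minimal sketch under two assumptions that the post does not confirm: that the last 10 values are a one-hot vector, and that position k of that vector stands for digit k. The class and method names (`OneHotToIndex`, `condense`, `convertFile`) are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: rewrites "784 pixels + 10 one-hot values" rows
// as "784 pixels + 1 integer label" rows.
class OneHotToIndex {
    static final int NUM_PIXELS = 784;  // 28 x 28 image, flattened
    static final int NUM_CLASSES = 10;  // digits 0 to 9

    /** Converts a single CSV line; the label becomes the index of the largest of the last 10 values. */
    static String condense(String line) {
        String[] fields = line.trim().split(",");
        if (fields.length != NUM_PIXELS + NUM_CLASSES) {
            throw new IllegalArgumentException(
                "Expected " + (NUM_PIXELS + NUM_CLASSES) + " values, got " + fields.length);
        }
        // Assumption: position k in the one-hot block means digit k.
        int label = 0;
        double best = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < NUM_CLASSES; i++) {
            double v = Double.parseDouble(fields[NUM_PIXELS + i].trim());
            if (v > best) {
                best = v;
                label = i;
            }
        }
        // Copy the pixel values unchanged and append the integer label.
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < NUM_PIXELS; i++) {
            out.append(fields[i].trim()).append(',');
        }
        return out.append(label).toString();
    }

    /** Streams the input file line by line so the whole data set never sits in memory. */
    static void convertFile(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in);
             BufferedWriter writer = Files.newBufferedWriter(out)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                writer.write(condense(line));
                writer.newLine();
            }
        }
    }
}
```

If the MATLAB export orders the classes differently (for example, with the tenth column standing for digit 0), the index would need to be remapped before it is written out.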

Now that I have this data, I'm not sure how to pass it through an iterator properly for my convolutional neural network. I've looked at the documentation, but this isn't exactly what I need. The examples in the docs for RecordReaderDataSetIterator were also a near miss, because they either treat each line of the CSV file as a 1-dimensional vector (as opposed to a matrix) or format the data for linear regression.

I hope this has been clear enough. Could someone please assist me?



Solution

  • Use CSVRecordReader with the label appended to the end of each row as an integer from 0 to 9.

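    Pass InputType.convolutionalFlat to setInputType at the bottom of the network configuration. Its arguments are height, width, and depth, so convolutionalFlat(28,28,1) tells the network that each flat row of 784 values is a 28x28 single-channel image. Example snippet: .setInputType(InputType.convolutionalFlat(28,28,1)) .backprop(true).pretrain(false).build();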

    Whole code example for the neural net config: https://github.com/deeplearning4j/dl4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/convolution/LenetMnistExample.java
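The iterator side of the answer above might look roughly like this. It is a sketch, not a tested setup: it assumes the DataVec and DL4J class and package names from releases around the time of this answer (they may differ in other versions), a CSV with no header row, and the integer label sitting in column 784. The class name `MnistCsvIterator` and the `build` method are hypothetical.

```java
import java.io.File;

import org.datavec.api.records.reader.RecordReader;
import org.datavec.api.records.reader.impl.csv.CSVRecordReader;
import org.datavec.api.split.FileSplit;
import org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

// Hypothetical wrapper around the two classes named in the answer.
class MnistCsvIterator {
    static DataSetIterator build(File csvFile, int batchSize) throws Exception {
        // Default constructor: skip no lines, split on commas.
        RecordReader reader = new CSVRecordReader();
        reader.initialize(new FileSplit(csvFile));

        int labelIndex = 784;  // columns 0..783 are pixels, column 784 is the label
        int numClasses = 10;   // digits 0 to 9

        // Classification mode: the integer label is expanded to a one-hot vector,
        // and the 784 pixel columns become one flat feature row per example.
        return new RecordReaderDataSetIterator(reader, batchSize, labelIndex, numClasses);
    }
}
```

Each batch from this iterator has flat feature rows of length 784, which is the layout that InputType.convolutionalFlat(28,28,1) expects, so no manual reshaping to a matrix should be needed before calling fit on the network.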