Tags: javascript, node.js, neural-network, mnist

Reading MNIST dataset with javascript/node.js


I'm trying to decode the dataset from this source: http://yann.lecun.com/exdb/mnist/

There is a description of the "very simple" IDX file format at the bottom of that page, but I cannot figure it out.

What I'm trying to achieve is something like:

var imagesFileBuffer = fs.readFileSync(__dirname + '/train-images-idx3-ubyte');
var labelFileBuffer  = fs.readFileSync(__dirname + '/train-labels-idx1-ubyte');
var pixelValues      = {};

Do magic

pixelValues are now like:

// {
//   "0": [0,0,200,190,79,0... for all 784 pixels ... ],
//   "4": [0,0,200,190,79,0... for all 784 pixels ... ],

and so on for every image entry in the dataset. I've tried to figure out the structure of the binary files, but failed.
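
For reference, the IDX layout described on that page is short enough to sketch. Every file starts with a 4-byte magic number (two zero bytes, a type code, and the number of dimensions), followed by one big-endian 32-bit size per dimension, then the raw data. Below is a minimal sketch of reading that header; `parseIdxHeader` and `demoHeader` are hypothetical names made up for illustration, not part of any library, and the demo bytes are synthetic rather than read from the real file.

```javascript
// Hypothetical helper: reads the header of an IDX file held in a Node.js Buffer.
function parseIdxHeader(buffer) {
    // Bytes 0 and 1 are always zero in a valid IDX file.
    if (buffer[0] !== 0 || buffer[1] !== 0) {
        throw new Error('Not an IDX file');
    }
    var typeCode = buffer[2];       // 0x08 means unsigned byte
    var dimensionCount = buffer[3]; // 3 for the images file, 1 for the labels file
    var sizes = [];
    for (var d = 0; d < dimensionCount; d++) {
        // Each dimension size is a big-endian unsigned 32-bit integer.
        sizes.push(buffer.readUInt32BE(4 + d * 4));
    }
    // The data starts right after the last dimension size.
    return { typeCode: typeCode, sizes: sizes, dataOffset: 4 + dimensionCount * 4 };
}

// Synthetic 16-byte header shaped like the one in train-images-idx3-ubyte:
// magic 0x00000803, then 60000 images, 28 rows, 28 columns.
var demoHeader = Buffer.from([0, 0, 8, 3,  0, 0, 0xEA, 0x60,  0, 0, 0, 28,  0, 0, 0, 28]);
var header = parseIdxHeader(demoHeader);
console.log(header.sizes, header.dataOffset);
```

Under these assumptions the pixel data of the images file begins at byte 16, and the label data of the labels file (one dimension only) begins at byte 8.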


Solution

  • I realized there would be duplicate keys in my pixelValues object, so I made it an array of objects instead. The following code creates the structure I'm after:

    var fs = require('fs');

    var dataFileBuffer  = fs.readFileSync(__dirname + '/train-images-idx3-ubyte');
    var labelFileBuffer = fs.readFileSync(__dirname + '/train-labels-idx1-ubyte');
    var pixelValues     = [];

    // The image count is the big-endian 32-bit integer at byte 4 of the header
    // (60000 for the training set), so it does not need to be hard coded.
    var imageCount = dataFileBuffer.readUInt32BE(4);

    for (var image = 0; image < imageCount; image++) {
        var pixels = [];

        // Pixels are stored row by row, so y (the row) is the outer loop.
        for (var y = 0; y <= 27; y++) {
            for (var x = 0; x <= 27; x++) {
                // + 16 skips the 16-byte header of the images file.
                pixels.push(dataFileBuffer[(image * 28 * 28) + (x + (y * 28)) + 16]);
            }
        }

        var imageData  = {};
        // + 8 skips the 8-byte header of the labels file.
        imageData[JSON.stringify(labelFileBuffer[image + 8])] = pixels;

        pixelValues.push(imageData);
    }
    

    The structure of pixelValues is now something like this:

    [
        {5: [0,0,0,0,0,0,0,0,0,0...]},
        {0: [0,0,0,0,0,0,0,0,0,0...]},
        ...
    ]
    

    There are 28x28 = 784 pixel values per image, each ranging from 0 to 255.

    To render an image, step through its pixel array with the same nested loops as above: the first value is the pixel in the upper-left corner, and each following value belongs at the position the loops have reached.
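
As a rough sketch of that rendering step, the pixels can be printed to the console as ASCII art instead of being drawn on a canvas. `renderAscii` is a hypothetical helper and the threshold of 128 is an arbitrary choice for illustration. It assumes the array is in row-major order (left to right, then top to bottom), which is how the IDX file stores each image.

```javascript
// Hypothetical helper: turns a flat, row-major pixel array into lines of text.
function renderAscii(pixels, width, height) {
    var lines = [];
    for (var y = 0; y < height; y++) {
        var line = '';
        for (var x = 0; x < width; x++) {
            // Arbitrary threshold for this sketch: bright pixels become '#'.
            line += pixels[x + y * width] >= 128 ? '#' : '.';
        }
        lines.push(line);
    }
    return lines.join('\n');
}

// Tiny 3x3 example image (a plus sign) standing in for a 28x28 digit.
var demo = [  0, 255,   0,
            255, 255, 255,
              0, 255,   0];
console.log(renderAscii(demo, 3, 3));
```

For a real digit, something like `renderAscii(Object.values(pixelValues[0])[0], 28, 28)` should work, since each entry of pixelValues holds a single label key whose value is the pixel array.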