Recreate Tensor From String


I am using TensorFlow.js (Node) to preprocess image files into tensors.

const tf = require('@tensorflow/tfjs');
require("@tensorflow/tfjs-node")
const mobilenetModule = require('@tensorflow-models/mobilenet');
const knnClassifier = require('@tensorflow-models/knn-classifier'); 
const { loadImage, createCanvas } = require('canvas')

When I create a classifier, it stores its dataset as key:value pairs whose values are Tensor class objects. After creating this object, I stringify it and write it to a file so I can use it later.

{ '0':
   Tensor {
     kept: true,
     isDisposedInternal: false,
     shape: [ 5, 1024 ],
     dtype: 'float32',
     size: 5120,
     strides: [ 1024 ],
     dataId: {},
     id: 1333,
     rankType: '2',
     scopeId: 728 },
  '1':
   Tensor {
     kept: true,
     isDisposedInternal: false,
     shape: [ 5, 1024 ],
     dtype: 'float32',
     size: 5120,
     strides: [ 1024 ],
     dataId: {},
     id: 2394,
     rankType: '2',
     scopeId: 1356 } }
fs.writeFileSync("test", util.inspect(classifier.getClassifierDataset(), false, 2, false))

When I read that string back, the JSON.parse() method throws an error because the file is not standard JSON:

(node:14780) UnhandledPromiseRejectionWarning: SyntaxError: Unexpected token ' in JSON at position 2

How do I convert a string of this format back to an object of the same exact format?

EDIT:

Solved:

1. Convert the tensor to an array
2. Save that array as a string
3. Read that string back from its stored location
4. Recreate the tensor

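// Save: convert the tensor to a nested array and write it as JSON
let tensorArr = tensor.arraySync()
fs.writeFileSync("test", JSON.stringify(tensorArr))
// Load: read the JSON back and rebuild the tensor
let classifierFile = fs.readFileSync("test", "utf8")
let test = JSON.parse(classifierFile)
let restored = tf.tensor(test)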

I am going to recommend that the tensorflow-models KnnClassifier do this automatically in its .getClassifierDataset method.


Solution

  • It is impossible to transform that string back into its original tensor, because the string does not contain the tensor's actual data. It only contains some metadata.

    Let's look at the first data you gave as an example:

    Tensor {
        kept: true,
        isDisposedInternal: false,
        shape: [ 5, 1024 ],
        dtype: 'float32',
        size: 5120,
        strides: [ 1024 ],
        dataId: {},
        id: 1333,
        rankType: '2',
        scopeId: 728
    }
    

    What I can say about the tensor is that it is of rank two and of the shape 5x1024. The total size is 5120 (so it has 5120 values associated with this tensor). The actual tensor data is however not present in this data.

    Another mistake is that you used the util.inspect function, which should only be used for debugging purposes and not to save data. To quote the docs:

    The util.inspect() method returns a string representation of object that is intended for debugging. The output of util.inspect may change at any time and should not be depended upon programmatically.

    You should use JSON.stringify instead.
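
    To see why the parse fails, here is a minimal sketch using only Node's built-in util module. The dataset object is a hypothetical stand-in for the classifier data (not a real Tensor), and isValidJson is a helper name made up for this illustration:

    ```javascript
    const util = require('util');

    // Hypothetical stand-in for the object in the question (not a real Tensor).
    const dataset = { '0': { shape: [5, 1024], dtype: 'float32' } };

    // util.inspect produces a human-readable dump with single quotes and unquoted keys.
    const inspected = util.inspect(dataset);
    // JSON.stringify produces text that JSON.parse can read back.
    const json = JSON.stringify(dataset);

    // Returns true if JSON.parse accepts the text, false if it throws.
    function isValidJson(text) {
      try {
        JSON.parse(text);
        return true;
      } catch (err) {
        return false;
      }
    }

    console.log(isValidJson(inspected)); // false
    console.log(isValidJson(json));      // true
    ```

    Note that even the JSON version of a real Tensor object would only carry its metadata, which is why the values have to be extracted first.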

    The correct way to do it

    Next time you want to save a tensor, use the tensor.array() function (asynchronous, returns a Promise) or its synchronous counterpart tensor.arraySync().

    Example

    const t = tf.tensor2d([[1,2], [3,4]]);
    const dataArray = t.arraySync();
    const serializedString = JSON.stringify(dataArray);
    console.log(serializedString);
    

    This will print: [[1,2],[3,4]]

    To deserialize the data you can then use the tf.tensor function:

    const serializedString = '[[1,2],[3,4]]';
    const dataArray = JSON.parse(serializedString);
    const t = tf.tensor(dataArray);
    t.print();
    

    t is then the same tensor as above. The output will be:

    Tensor
        [[1, 2],
         [3, 4]]
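
    For the classifier dataset in the question, which maps each label to a tensor, the same idea could be applied per entry. The following is a rough sketch, not the library's own mechanism: serializeDataset and deserializeDataset are hypothetical helper names, and storing the shape next to the values is my own choice so that each tensor is rebuilt with the dimensions it was saved with. tf is passed in as a parameter so the helpers do not depend on a particular TensorFlow.js build.

    ```javascript
    // Convert { label: Tensor } into a JSON string holding each tensor's shape and values.
    // Hypothetical helper; assumes every value exposes .shape and .arraySync().
    function serializeDataset(dataset) {
      const plain = {};
      for (const [label, tensor] of Object.entries(dataset)) {
        plain[label] = { shape: tensor.shape, values: tensor.arraySync() };
      }
      return JSON.stringify(plain);
    }

    // Rebuild { label: Tensor } from the string produced by serializeDataset.
    // `tf` is the loaded TensorFlow.js module, passed in by the caller.
    function deserializeDataset(json, tf) {
      const plain = JSON.parse(json);
      const dataset = {};
      for (const [label, { shape, values }] of Object.entries(plain)) {
        dataset[label] = tf.tensor(values, shape);
      }
      return dataset;
    }
    ```

    With the real library, the string from serializeDataset(classifier.getClassifierDataset()) could be written with fs.writeFileSync, read back later, and the result of deserializeDataset(json, tf) handed to classifier.setClassifierDataset().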