I have converted a TensorFlow model to TensorFlow.js and tried using it in the browser. There are some preprocessing steps that have to be executed on the input before feeding it to the model for inference. I implemented these steps the same way as in TensorFlow. The problem is that the inference results in TF.js do not match those from TensorFlow. So I started debugging the code and found that the results of the floating-point arithmetic operations in the TF.js preprocessing differ from those in TensorFlow, which is running in a Docker container with a GPU. The code used in TF.js is below.
var tensor3d = tf.tensor3d(image, [height, width, 1], 'float32');
var pi = PI.toString();
if (bs == 14 && pi.indexOf('1') != -1) {
    tensor3d = tensor3d.sub(-9798.6993999999995).div(7104.607118190255);
} else if (bs == 12 && pi.indexOf('1') != -1) {
    tensor3d = tensor3d.sub(-3384.9893000000002).div(1190.0708513300835);
} else if (bs == 12 && pi.indexOf('2') != -1) {
    tensor3d = tensor3d.sub(978.31200000000001).div(1092.2426342420442);
}
var resizedTensor = tensor3d.resizeNearestNeighbor([224, 224]).toFloat();
var copiedTens = tf.tile(resizedTensor, [1, 1, 3]);
return copiedTens.expandDims();
Python code blocks used
ds = pydicom.dcmread(input_filename, stop_before_pixels=True)
if (ds.BitsStored == 12) and '1' in ds.PhotometricInterpretation:
    normalize_mean = -3384.9893000000002
    normalize_std = 1190.0708513300835
elif (ds.BitsStored == 12) and '2' in ds.PhotometricInterpretation:
    normalize_mean = 978.31200000000001
    normalize_std = 1092.2426342420442
elif (ds.BitsStored == 14) and '1' in ds.PhotometricInterpretation:
    normalize_mean = -9798.6993999999995
    normalize_std = 7104.607118190255
else:
    error_response = ("Unable to read required metadata, or metadata invalid. "
                      "BitsStored: {}. PhotometricInterpretation: {}".format(
                          ds.BitsStored, ds.PhotometricInterpretation))
    error_json = {'code': 500, 'message': error_response}
    self._set_headers(500)
    self.wfile.write(json.dumps(error_json).encode())
    return

normalization = Normalization(mean=normalize_mean, std=normalize_std)
resize = ResizeImage()
copy_channels = CopyChannels()
inference_data_collection.append_preprocessor([normalization, resize, copy_channels])
Normalization code
def normalize(self, normalize_numpy, mask_numpy=None):
    normalize_numpy = normalize_numpy.astype(float)
    if mask_numpy is not None:
        mask = mask_numpy > 0
    elif self.mask_zeros:
        mask = np.nonzero(normalize_numpy)
    else:
        mask = None
    if mask is None:
        normalize_numpy = (normalize_numpy - self.mean) / self.std
    else:
        raise NotImplementedError
    return normalize_numpy
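Note that astype(float) promotes the array to float64, while the tfjs tensor is created as float32. The normalization constants themselves are not exactly representable in float32, so the two pipelines already diverge slightly at this step. A minimal numpy sketch illustrating this (the input value 1234.0 is just an arbitrary example pixel):

```python
import numpy as np

mean, std = -3384.9893000000002, 1190.0708513300835

# float64 path, as in the Python preprocessor (astype(float) gives float64)
x64 = (np.float64(1234.0) - mean) / std

# float32 path, as in the tfjs tensor created with dtype 'float32'
x32 = (np.float32(1234.0) - np.float32(mean)) / np.float32(std)

# the constant loses precision when stored as float32
print(np.float64(np.float32(mean)) == mean)  # False
print(x64, float(x32))
```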
ResizeImage code
from skimage.transform import resize

def Resize(self, data_group):
    input_data = data_group.preprocessed_case
    output_data = resize(input_data, self.output_dim)
    data_group.preprocessed_case = output_data
    self.output_data = output_data
CopyChannels code
def CopyChannels(self, data_group):
    input_data = data_group.preprocessed_case
    if self.new_channel_dim:
        output_data = np.stack([input_data] * self.channel_multiplier, -1)
    else:
        output_data = np.tile(input_data, (1, 1, self.channel_multiplier))
    data_group.preprocessed_case = output_data
    self.output_data = output_data
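As a quick sanity check (a numpy sketch, assuming the input already carries a trailing singleton channel dimension, as the tfjs tensor does), the np.stack path and the tf.tile(resizedTensor, [1, 1, 3]) pattern produce identical arrays, so this step is unlikely to cause the discrepancy by itself:

```python
import numpy as np

x = np.arange(4.0).reshape(2, 2, 1)  # (H, W, 1), like the resized tfjs tensor

# mirrors tf.tile(resizedTensor, [1, 1, 3]) in the JS code
tiled = np.tile(x, (1, 1, 3))

# mirrors the np.stack branch of CopyChannels on the squeezed image
stacked = np.stack([x[..., 0]] * 3, -1)

print(tiled.shape)                    # (2, 2, 3)
print(np.array_equal(tiled, stacked)) # True
```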
Sample outputs (left is TensorFlow on Docker with GPU, right is TF.js):
The results are actually different after every step.
There are a number of possibilities that could lead to this issue.
1 - The ops used in Python are not used in the same manner in JS and Python. If that is the case, using exactly the same ops will get rid of the issue.
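This is plausibly the case here: the Python pipeline resizes with skimage.transform.resize, which interpolates (bilinear, order=1, by default), while the JS code uses resizeNearestNeighbor. To see how much the choice of resize op alone matters, here is a toy numpy sketch (not the actual skimage or tfjs implementations) comparing nearest-neighbour pixel picking with 2x2 block averaging, which approximates an interpolating resize for a 2x downscale:

```python
import numpy as np

img = np.arange(16.0).reshape(4, 4)

# nearest neighbour: pick every second pixel, as a nearest-neighbor resize would
nearest = img[::2, ::2]

# 2x2 block average: roughly what an interpolating resize does on a 2x downscale
bilinear_like = img.reshape(2, 2, 2, 2).mean(axis=(1, 3))

print(nearest)        # [[ 0.  2.] [ 8. 10.]]
print(bilinear_like)  # [[ 2.5  4.5] [10.5 12.5]]
```

Every pixel differs between the two outputs, and those differences then propagate through the rest of the preprocessing.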
2 - The image might be read differently by the Python library and the browser canvas. In fact, across browsers, canvas pixels don't always have the same values, due to operations such as anti-aliasing, as explained in this answer. So there might be slight differences in the results of the subsequent operations. To check whether this is the root cause of the issue, first print the Python and the JS image arrays and see if they are alike. It is likely that the 3D tensor differs between JS and Python.
tensor3d = tf.tensor3d(image,[height,width,1],'float32')
In that case, instead of reading the image directly in the browser, one can use the Python library to convert the image to an array, and have tfjs read this array directly instead of the image. That way, the input tensors will be the same in both JS and Python.
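One way to do this (a sketch; the array shape and payload layout are illustrative, not from the original code) is to serialize the numpy array to JSON on the Python side, and rebuild the tensor in JS with tf.tensor(payload['data'], payload['shape'], 'float32'):

```python
import json
import numpy as np

# hypothetical image array; in the real pipeline this would come from pydicom
image = np.zeros((4, 4, 1), dtype=np.float32)

# serialize exactly the values Python read, so tfjs can rebuild the same tensor
payload = {'shape': list(image.shape), 'data': image.ravel().tolist()}
serialized = json.dumps(payload)

# round-trip check: the values survive serialization bit-for-bit
loaded = json.loads(serialized)
restored = np.array(loaded['data'], dtype=np.float32).reshape(loaded['shape'])
print(np.array_equal(restored, image))  # True
```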
3 - It is a float32 precision issue. tensor3d is created with the dtype float32, and depending on the operations used, there might be a precision issue. Consider this operation:
tf.scalar(12045, 'int32').mul(tf.scalar(12045, 'int32')).print(); // 145082032 instead of 145082025
The same precision issue will be encountered in python with the following:
a = tf.constant([12045], dtype='float32') * tf.constant([12045], dtype='float32')
tf.print(a)  # 145082032
In Python this can be solved by using the int32 dtype. However, because of the WebGL float32 limitation, the same thing can't be done with the WebGL backend in tfjs. In neural networks, this precision issue is not a big deal. To get rid of it, one can change the backend, for instance using setBackend('cpu'), which is much slower.
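The same rounding can be reproduced with plain numpy, independent of any TensorFlow backend, which shows it is a property of the float32 format itself:

```python
import numpy as np

# 12045 * 12045 = 145082025 exactly, but float32 has only a 24-bit significand,
# so the product is rounded to the nearest representable value
f32 = np.float32(12045) * np.float32(12045)
f64 = np.float64(12045) * np.float64(12045)

print(int(f32))  # 145082032
print(int(f64))  # 145082025
```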