I have a tf.data.Dataset of images with input shape (batch_size, 128, 128, 2) and target shape (batch_size, 128, 128, 1), where the inputs are 2-channel images (complex-valued images, with the two channels holding the real and imaginary parts) and the targets are 1-channel images (real-valued images).
I need to normalize the input and target images by first subtracting their mean image and then scaling them to the (0, 1) range. If I am not wrong, tf.data.Dataset works with one batch at a time rather than the entire dataset. So, in the py_function remove_mean I subtract the mean image of the batch from each image in the batch, and in the py_function linear_scaling I scale each image to (0, 1) by subtracting its minimum value and dividing by the difference between its maximum and minimum values. However, when I print the min value of an input image from the dataset before and after applying these functions, the image values do not change.
Could anyone suggest what may be going wrong here?
import numpy as np
import tensorflow as tf

def remove_mean(image, target):
    # Subtract the per-batch mean image from every image in the batch
    image_mean = np.mean(image, axis=0)
    target_mean = np.mean(target, axis=0)
    image = image - image_mean
    target = target - target_mean
    return image, target
def linear_scaling(image, target):
    # Scale each image to (0, 1) using its own per-channel min and max
    image_min = np.min(image, axis=(1, 2), keepdims=True)
    image_max = np.max(image, axis=(1, 2), keepdims=True)
    image = (image - image_min) / (image_max - image_min)
    target_min = np.min(target, axis=(1, 2), keepdims=True)
    target_max = np.max(target, axis=(1, 2), keepdims=True)
    target = (target - target_min) / (target_max - target_min)
    return image, target
a, b = next(iter(train_dataset))
print(tf.math.reduce_min(a[0,:,:,:]))
train_dataset.map(lambda item1, item2: tuple(tf.py_function(remove_mean, [item1, item2], [tf.float32, tf.float32])))
test_dataset.map(lambda item1, item2: tuple(tf.py_function(remove_mean, [item1, item2], [tf.float32, tf.float32])))
a, b = next(iter(train_dataset))
print(tf.math.reduce_min(a[0,:,:,:]))
train_dataset.map(lambda item1, item2: tuple(tf.py_function(linear_scaling, [item1, item2], [tf.float32])))
test_dataset.map(lambda item1, item2: tuple(tf.py_function(linear_scaling, [item1, item2], [tf.float32])))
a, b = next(iter(train_dataset))
print(tf.math.reduce_min(a[0,:,:,:]))
Output:
tf.Tensor(-0.00040511801, shape=(), dtype=float32)
tf.Tensor(-0.00040511801, shape=(), dtype=float32)
tf.Tensor(-0.00040511801, shape=(), dtype=float32)
map is not an in-place operation: Dataset.map returns a new dataset, so your train_dataset does not change when you call train_dataset.map(...) and discard the result.
Assign the result back instead: train_dataset = train_dataset.map(...).
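For illustration, here is a minimal sketch of the corrected mapping calls. It simply reassigns each mapped dataset, and it assumes both py_functions return an (image, target) pair, so two output dtypes are passed to tf.py_function (the linear_scaling call in the question listed only one):

# Reassign the results of map(); tf.data transformations return new datasets.
train_dataset = train_dataset.map(
    lambda item1, item2: tuple(tf.py_function(remove_mean, [item1, item2], [tf.float32, tf.float32])))
test_dataset = test_dataset.map(
    lambda item1, item2: tuple(tf.py_function(remove_mean, [item1, item2], [tf.float32, tf.float32])))

# linear_scaling also returns two tensors, so two output dtypes are listed here
# (assumption on my part; the question passed only [tf.float32]).
train_dataset = train_dataset.map(
    lambda item1, item2: tuple(tf.py_function(linear_scaling, [item1, item2], [tf.float32, tf.float32])))
test_dataset = test_dataset.map(
    lambda item1, item2: tuple(tf.py_function(linear_scaling, [item1, item2], [tf.float32, tf.float32])))

a, b = next(iter(train_dataset))
print(tf.math.reduce_min(a[0, :, :, :]))  # now reflects the normalized values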