I have a dataset of 200k+ color patches captured on two different mediums that I'm building a color transformation for. Initially I trained the neural network on a direct RGB-to-RGB input-output mapping. That works decently well, but I wanted to perform the match in a luminance-chrominance space to potentially better translate luminance and color contrast relationships. I first tried CIELAB and YCbCr, but transforming the dataset into either space is ultimately inaccurate, as the data represents HDR scene data in a logarithmic container and neither space is built for HDR scene representation. So I'm attempting to use Dolby's ICtCp space, which is built from unbounded scene-linear information. I performed the transformation into the space and confirmed the output and array structure to be correct. However, upon feeding the variables into the network, it immediately starts giving astronomical losses before flipping over to inf and then nan loss. I can't figure out what the issue is.
I'm using the colour-science library for the internal color transforms, and I've tested both a custom loss written specifically for the ICtCp space and the MSE loss built into TF (to rule out a formatting issue). Both gave extreme losses. I also printed the RGB and ICtCp values to text files to make sure there weren't any out-of-range values, but that was not the issue: RGB values are in a 0-1 range and ICtCp values are in an I (0:1), Ct (-1:1), Cp (-1:1) range.
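That check boils down to something like the following (a rough numpy sketch rather than my exact script; report_ranges and the array names are just placeholders):

import numpy as np

def report_ranges(label, values):
    # values: (N, 3) array of RGB or ICtCp triplets
    print(label,
          "min:", values.min(axis=0),
          "max:", values.max(axis=0),
          "any NaN:", bool(np.isnan(values).any()),
          "any Inf:", bool(np.isinf(values).any()))

# e.g. report_ranges("source ICtCp", source_itp) on the converted arrays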
My color transform functions in and out of ICtCp
import colour

# DaVinci Wide Gamut Intermediate to Dolby ICtCp HDR opponent space
def DWG_TO_ITP(rgb_values):
    cs = colour.models.RGB_COLOURSPACE_DAVINCI_WIDE_GAMUT
    # DWG DI (log) to linear XYZ
    xyzLin = colour.RGB_to_XYZ(rgb_values, cs.whitepoint, cs.whitepoint,
                               cs.matrix_RGB_to_XYZ, cctf_decoding=cs.cctf_decoding)
    # XYZ to ICtCp
    ictcp = colour.XYZ_to_ICtCp(xyzLin)
    return ictcp

# Dolby ICtCp HDR opponent space to DaVinci Wide Gamut Intermediate
def ITP_TO_DWG(itp_values):
    cs = colour.models.RGB_COLOURSPACE_DAVINCI_WIDE_GAMUT
    # ICtCp to linear XYZ
    xyzLin = colour.ICtCp_to_XYZ(itp_values)
    # linear XYZ to DWG DI (log)
    dwg = colour.XYZ_to_RGB(xyzLin, cs.whitepoint, cs.whitepoint,
                            cs.matrix_XYZ_to_RGB, cctf_encoding=cs.cctf_encoding)
    return dwg
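As an extra sanity check on the conversion pair, a round trip like the following (arbitrary sample values, not part of the training code) should come back to the input:

import numpy as np

# DWG DI -> ICtCp -> DWG DI should reproduce the input for in-range values
sample_rgb = np.array([[0.18, 0.18, 0.18],
                       [0.50, 0.25, 0.10],
                       [0.90, 0.85, 0.80]])
recovered = ITP_TO_DWG(DWG_TO_ITP(sample_rgb))
print(np.abs(sample_rgb - recovered).max())  # ~0, up to float precision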
Custom Loss (not currently active)
import tensorflow as tf

def ITP_loss(y_true, y_pred):
    # Split the ICtCp values into I, T, and P components
    I_1, T_1, P_1 = tf.split(y_true, 3, axis=-1)
    I_2, T_2, P_2 = tf.split(y_pred, 3, axis=-1)
    # Scale the T (Ct) components by 0.5, as in the original delta_E_ITP function
    T_1 = T_1 * 0.5
    T_2 = T_2 * 0.5
    # Compute the Euclidean distance scaled by 720
    d_E_ITP = 720 * tf.sqrt(
        tf.square(I_2 - I_1) +
        tf.square(T_2 - T_1) +
        tf.square(P_2 - P_1)
    )
    # Return the mean error as the loss
    return tf.reduce_mean(d_E_ITP)
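On hand-made finite ICtCp triplets it evaluates to an ordinary, finite number (a quick sketch with arbitrary values):

import tensorflow as tf

y_true = tf.constant([[0.50,  0.00,  0.00],
                      [0.25, -0.10,  0.05]])
y_pred = tf.constant([[0.52,  0.01, -0.01],
                      [0.24, -0.12,  0.06]])
print(ITP_loss(y_true, y_pred).numpy())  # small finite delta E ITP, not inf/nan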
My neural network
import numpy as np
from tensorflow import keras
from tensorflow.keras.callbacks import EarlyStopping

def transform_nn(combined_rgb_values, output_callback, epochs=10000, batch_size=32):
    source_rgb = np.vstack([rgb_pair[0] for rgb_pair in combined_rgb_values])
    target_rgb = np.vstack([rgb_pair[1] for rgb_pair in combined_rgb_values])
    source_itp = DWG_TO_ITP(source_rgb)
    target_itp = DWG_TO_ITP(target_rgb)

    # Neural network base model with L2 regularization
    alpha = 0  # no penalty for now
    model = keras.Sequential([
        keras.layers.Input(shape=(3,)),
        keras.layers.Dense(128, activation='gelu', kernel_regularizer=keras.regularizers.L2(alpha)),
        keras.layers.Dense(64, activation='gelu', kernel_regularizer=keras.regularizers.L2(alpha)),
        keras.layers.Dense(32, activation='gelu', kernel_regularizer=keras.regularizers.L2(alpha)),
        keras.layers.Dense(3)
    ])

    # Model optimization with Adam
    adam_optimizer = keras.optimizers.Adam(learning_rate=0.001)
    model.compile(
        optimizer=adam_optimizer,
        loss="mean_squared_error",
        metrics=['mean_squared_error'])

    # Early stopping on validation loss
    early_stopping_norm = EarlyStopping(
        monitor='val_loss',
        patience=30,
        verbose=1,
        restore_best_weights=True
    )

    # Train with early stopping
    history = model.fit(x=source_itp, y=target_itp,
                        epochs=epochs, batch_size=batch_size,
                        verbose="auto", validation_split=0.3,
                        callbacks=[early_stopping_norm])

    def interpolator(input_rgb):
        input_itp = DWG_TO_ITP(input_rgb)
        output_itp = model.predict(input_itp)
        output_rgb = ITP_TO_DWG(output_itp)
        return output_rgb

    return interpolator
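For context, it gets called roughly like this (a sketch; new_source_rgb is just a placeholder for another array of DWG DI values):

# combined_rgb_values: list of (source_rgb, target_rgb) pairs,
# each an array of DWG DI triplets in 0-1
interpolate = transform_nn(combined_rgb_values, output_callback=None)
matched_rgb = interpolate(new_source_rgb)  # DWG DI in, DWG DI out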
And lastly, the kind of losses I'm getting. Note: this is the mean_squared_error loss passed to model.compile, but similarly extreme values show up with the custom loss. I never ran into this issue with either the CIELAB or YCbCr implementation.
Epoch 1/10000
  70/9078 [..............................] - ETA: 6s - loss: 19151210161612029119172287351962936121302040109299793920.0000 - mean_squared_error: 1915121016161202911917228735196293612130
 150/9078 [..............................] - ETA: 6s - loss: 8937231408752302941104862160146934414914780835554000896.0000 - mean_squared_error: 89372314087523029411048621601469344149147
 236/9078 [..............................] - ETA: 5s - loss: 8411239422438024050387858001836140461389620391983448064.0000 - mean_squared_error: 84112394224380240503878580018361404613896
 322/9078 [>.............................] - ETA: 5s - loss: 55694874365583834449267799576553768559551931724848365789378071082067252634355826658906428848197067342214382161787617280.0000
 407/9078 [>.............................] - ETA: 5s - loss: 9272320170949610945087897503565859983725183487173275717008470165482614622395441710684957926712521227412477496744314184602899
 494/9078 [>.............................] - ETA: 5s - loss: inf - mean_squared_error: inf
9078/9078 [==============================] - 7s 686us/step - loss: nan - mean_squared_error: nan - val_loss: nan - val_mean_squared_error: nan
Epoch 2/10000
9078/9078 [==============================] - 6s 682us/step - loss: nan - mean_squared_error: nan - val_loss: nan - val_mean_squared_error: nan
Epoch 3/10000
8987/9078 [============================>.] - ETA: 0s - loss: nan - mean_squared_error: nan
I suspect it's because ICtCp to XYZ can result in NaN, for example for (0.24, -0.42, 0.48). In this case S' ≈ -0.15, and the PQ transfer function (EOTF) tries to raise that to the power 1 / 78.84375, and it isn't clear how a negative base should be handled there.
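That failure is easy to reproduce in isolation (a minimal sketch, assuming the PQ EOTF is evaluated with a plain floating-point power):

import numpy as np

# A negative base raised to a non-integer power has no real-valued result,
# so a straight power in the PQ EOTF turns S' ~= -0.15 into NaN:
print(np.power(-0.15, 1.0 / 78.84375))  # -> nan (RuntimeWarning: invalid value)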