I am applying thresholding to an image containing text and digits. Using skimage.filters.try_all_threshold
applies 7 thresholding algorithms. I am able to get the results, but I am wondering how I can choose just one of them, ideally selecting the best one dynamically, to pass on to the next processing step.
You need to define a measure of similarity between the original image and the binarized images, and then select the thresholding method that maximizes that measure.
The following code is only meant to put you on the right track. Note that the function similarity
returns a random number rather than a sensible similarity measure. You should implement it yourself or replace it with an appropriate function.
import numpy as np
from skimage.data import text
import skimage.filters
import matplotlib.pyplot as plt
threshold_methods = [skimage.filters.threshold_otsu,
                     skimage.filters.threshold_yen,
                     skimage.filters.threshold_isodata,
                     skimage.filters.threshold_li,
                     skimage.filters.threshold_mean,
                     skimage.filters.threshold_minimum,
                     skimage.filters.threshold_triangle,
                     ]
def similarity(img, threshold_method):
    """Similarity measure between the original image img and the
    result of applying threshold_method to it.
    """
    # Placeholder: replace with a real similarity measure
    return np.random.random()
image = text()
results = np.asarray([similarity(image, f) for f in threshold_methods])
# Pick the method with the highest similarity score
best_index = np.argmax(results)
best_method = threshold_methods[best_index]
threshold = best_method(image)
binary = image >= threshold
fig, ax = plt.subplots(1, 1)
ax.imshow(binary, cmap=plt.cm.gray)
ax.axis('off')
ax.set_title(best_method.__name__)
plt.show()
Obviously, it makes no sense to choose the thresholding method randomly (as I did in the toy example above). Instead, you should implement a similarity measure that lets you automatically select the most effective algorithm. One possible way to do so is to compute the misclassification error with respect to a reference (ground-truth) binarization, i.e. the percentage of background pixels wrongly assigned to foreground and, conversely, foreground pixels wrongly assigned to background. As the misclassification error is a dissimilarity measure rather than a similarity measure, you have to select the method that minimizes it, like this:
best_index = np.nonzero(results == results.min())[0][0]
Take a look at this paper for details on this and other approaches to thresholding performance assessment.
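The misclassification error described above could be sketched roughly as follows. This is only an illustration under assumptions: it presumes a ground-truth mask is available, and since none exists for the sample image, it builds a synthetic noisy image with a known mask. The names misclassification_error, ground_truth and the synthetic data are hypothetical and not part of scikit-image.

```python
import numpy as np
from skimage.filters import threshold_otsu, threshold_mean, threshold_isodata


def misclassification_error(binary, ground_truth):
    """Fraction of pixels whose foreground/background label differs
    from the ground-truth mask (hypothetical helper)."""
    return float(np.mean(binary.astype(bool) != ground_truth.astype(bool)))


# Synthetic test data (an assumption for illustration): a bright square
# on a dark background, corrupted by Gaussian noise.
rng = np.random.default_rng(0)
ground_truth = np.zeros((64, 64), dtype=bool)
ground_truth[16:48, 16:48] = True
image = np.where(ground_truth, 0.8, 0.2) + rng.normal(scale=0.1, size=ground_truth.shape)
image = np.clip(image, 0.0, 1.0)

methods = [threshold_otsu, threshold_mean, threshold_isodata]
errors = np.asarray([misclassification_error(image >= f(image), ground_truth)
                     for f in methods])

# The error is a dissimilarity measure, so the best method minimizes it.
best_method = methods[int(np.argmin(errors))]
print(best_method.__name__, errors.min())
```

In practice the ground truth would come from a few manually binarized sample images, and the method that wins on those samples would then be applied to the remaining images.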