Detecting malaria cells in a scan image

I use malaria scan images to classify whether an image shows malaria or not. The dataset was downloaded from Kaggle.

I achieve an accuracy above 96%.

Now I would like to detect the cell within the scan image. I need to point out the malaria cell in the image or draw an outline around it.

Sample image that contains a malaria cell:

How can I achieve this kind of detection?

Solution

Assuming you want to locate the dark purple region in the image, here is one way to do it using Python/OpenCV/NumPy/scikit-learn.

Read the input image without the alpha channel (the default for cv2.imread), which removes the text guides
Do k-means color segmentation using 3 colors (which should return black, light purple, and dark purple). I use scikit-learn because I find it a bit simpler, but you can also do it with OpenCV.
Threshold the color image on the dark purple color
(Optionally add some morphology to clean up the mask; I did not use it here)
Get all contours and, separately, the largest contour
Save the resulting images

Input:

import cv2
import numpy as np
from sklearn import cluster

# read image
image = cv2.imread("purple_cell.png")
h, w, c = image.shape

# convert image to float in range 0-1 for sklearn kmeans
img = image.astype(np.float64)/255.0

# flatten to a 2D array: one row per pixel, one column per channel
image_1d = img.reshape(h*w, c)

# compute kmeans for 3 colors
kmeans_cluster = cluster.KMeans(n_clusters=3)
kmeans_cluster.fit(image_1d)
cluster_centers = kmeans_cluster.cluster_centers_
cluster_labels = kmeans_cluster.labels_

# need to scale back to range 0-255
newimage = (255*cluster_centers[cluster_labels].reshape(h, w, c)).clip(0,255).astype(np.uint8)

# BGR range around the dark purple cluster color (values chosen for this image)
lowerBound = np.array([170, 90, 120])
upperBound = np.array([195, 110, 140])

# compute a binary mask of the pixels in newimage that fall inside the range
thresh = cv2.inRange(newimage, lowerBound, upperBound)

# get all contours and track the largest one by area
contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# findContours returns 2 values in OpenCV 4.x and 3 values in OpenCV 3.x
contours = contours[0] if len(contours) == 2 else contours[1]
max_area = 0
big_contour = None
result1 = image.copy()
for cntr in contours:
    cv2.drawContours(result1, [cntr], -1, (0, 255, 0), 1)
    area = cv2.contourArea(cntr)
    if area > max_area:
        max_area = area
        big_contour = cntr

# draw largest contour only (skipped if nothing was found in the mask)
result2 = image.copy()
if big_contour is not None:
    cv2.drawContours(result2, [big_contour], -1, (0, 255, 0), 1)


cv2.imshow('image', image)
cv2.imshow('newimage', newimage)
cv2.imshow('thresh', thresh)
cv2.imshow('result1', result1)
cv2.imshow('result2', result2)
cv2.waitKey(0)  # wait for a key press
cv2.destroyAllWindows()

# save results
cv2.imwrite('purple_cell_kmeans_3.png', newimage)
cv2.imwrite('purple_cell_thresh.png', thresh)
cv2.imwrite('purple_cell_extracted1.png', result1)
cv2.imwrite('purple_cell_extracted2.png', result2)

Kmeans Image:

Thresholded Image:

All Contours Image:

Largest Contour Image: