How to use an RGB image as input for the C# EvalDll wrapper?

I trained a network using the provided ImageReader, and now I'm trying to use the CNTK EvalDll in a C# project to evaluate RGB images.

I've seen examples related to the EvalDll, but the input is always an array of float/double, never images.

How can I use the exposed interface to evaluate an RGB image with the trained network?

Solution

I'll assume that you'll want the equivalent of reading with the ImageReader, where your reader config looks something like

features=[
        width=224
        height=224
        channels=3
        cropType=Center
]

You'll need helper functions to crop the image and to resize it to the input size the network accepts.

I'll define two extension methods on System.Drawing.Bitmap, one to crop and one to resize. The code is F#, but it only calls System.Drawing APIs that are equally available from C#:

open System.Collections.Generic
open System.Drawing
open System.Drawing.Drawing2D
open System.Drawing.Imaging
type Bitmap with
    /// Crops the image in the present object, starting at the given (column, row), and retaining
    /// the given number of columns and rows.
    member this.Crop(column, row, numCols, numRows) = 
        let rect = Rectangle(column, row, numCols, numRows)
        this.Clone(rect, this.PixelFormat)
    /// Creates a resized version of the present image. The returned image
    /// will have the given width and height. This may distort the aspect ratio
    /// of the image.
    member this.ResizeImage(width, height, useHighQuality) =
        // Rather than using image.GetThumbnailImage, use direct image resizing.
        // GetThumbnailImage throws odd out-of-memory exceptions on some 
        // images, see also 
        // http://stackoverflow.com/questions/27528057/c-sharp-out-of-memory-exception-in-getthumbnailimage-on-a-server
        // Use the interpolation method suggested on 
        // http://stackoverflow.com/questions/1922040/resize-an-image-c-sharp
        let rect = Rectangle(0, 0, width, height)
        let destImage = new Bitmap(width, height)
        destImage.SetResolution(this.HorizontalResolution, this.VerticalResolution)
        use graphics = Graphics.FromImage destImage
        graphics.CompositingMode <- CompositingMode.SourceCopy
        if useHighQuality then
            graphics.InterpolationMode <- InterpolationMode.HighQualityBicubic
            graphics.CompositingQuality <- CompositingQuality.HighQuality
            graphics.SmoothingMode <- SmoothingMode.HighQuality
            graphics.PixelOffsetMode <- PixelOffsetMode.HighQuality
        else
            graphics.InterpolationMode <- InterpolationMode.Low
        use wrapMode = new ImageAttributes()
        wrapMode.SetWrapMode WrapMode.TileFlipXY
        graphics.DrawImage(this, rect, 0, 0, this.Width, this.Height, GraphicsUnit.Pixel, wrapMode)
        destImage

Based on that, define a function to do the center crop:

/// Returns a square sub-image from the center of the given image, with
/// a side length that is cropRatio times the smaller image dimension.
/// Cropping does not rescale, so the image content is not distorted.
let CenterCrop cropRatio (image: Bitmap) =
    let cropSize = 
        float(min image.Height image.Width) * cropRatio
        |> int
    let startRow = (image.Height - cropSize) / 2
    let startCol = (image.Width - cropSize) / 2
    image.Crop(startCol, startRow, cropSize, cropSize)
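
To make the crop arithmetic concrete, here is a minimal sketch that repeats the same calculation on plain integers, without touching a Bitmap. The function name centerCropRect is made up for illustration and is not part of the answer's code or of any library:

```fsharp
/// Computes (startCol, startRow, cropSize) for a centered square crop.
/// Mirrors the arithmetic in CenterCrop, but takes plain integers.
let centerCropRect (cropRatio: float) (width: int) (height: int) =
    let cropSize = int (float (min height width) * cropRatio)
    let startRow = (height - cropSize) / 2
    let startCol = (width - cropSize) / 2
    startCol, startRow, cropSize

// A 640 x 480 image with cropRatio 1.0 keeps the middle 480 x 480 square.
let fullCrop = centerCropRect 1.0 640 480   // (80, 0, 480)
// With cropRatio 0.5 the square shrinks to 240 x 240 around the center.
let halfCrop = centerCropRect 0.5 640 480   // (200, 120, 240)
```

With cropRatio 1.0 the crop spans the full shorter side, so only the longer side loses pixels.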

Then put it all together: crop, resize, and traverse the image in the plane order that OpenCV uses:

/// Creates a list of CNTK feature values from a given bitmap.
/// The image is first center-cropped to a square, then resized to
/// (targetSize x targetSize), and finally the image planes are
/// written out one after another as a CNTK tensor.
/// Returns a list with targetSize*targetSize*3 values.
let ImageToFeatures (image: Bitmap, targetSize) =
    // Apply the same image pre-processing that is typically done
    // in CNTK when running it in test or write mode: Take a center
    // crop of the image, then re-size it to the network input size.
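    // Both intermediate bitmaps hold GDI+ resources, so dispose them
    // once the features have been extracted.
    use cropped = CenterCrop 1.0 image
    use resized = cropped.ResizeImage(targetSize, targetSize, false)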
    // Ensure that the initial capacity of the list is provided 
    // with the constructor. Creating the list via the default constructor
    // makes the whole operation 20% slower.
    let features = List<float32>(targetSize * targetSize * 3)
    // Traverse the image in the format that is used in OpenCV:
    // first the B plane, then the G plane, then the R plane.
    for c in 0 .. 2 do
        for h in 0 .. (resized.Height - 1) do
            for w in 0 .. (resized.Width - 1) do
                let pixel = resized.GetPixel(w, h)
                let v = 
                    match c with 
                    | 0 -> pixel.B
                    | 1 -> pixel.G
                    | 2 -> pixel.R
                    | _ -> failwith "No such channel"
                    |> float32
                features.Add v
    features
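
The loop above produces a planar layout: every B value first, then every G value, then every R value. As a rough illustration of that layout, the following sketch converts an interleaved byte array (B, G, R, B, G, R, ... row by row) into planar order and shows the index formula involved. The function interleavedToPlanar and the interleaved input format are assumptions made for this example, not something the EvalDll requires:

```fsharp
/// Converts interleaved pixel data (B,G,R,B,G,R,... row by row) to
/// planar order (all B values, then all G values, then all R values).
let interleavedToPlanar (width: int) (height: int) (pixels: byte[]) : float32[] =
    let channels = 3
    let planeSize = width * height
    let result = Array.zeroCreate<float32> (planeSize * channels)
    for h in 0 .. height - 1 do
        for w in 0 .. width - 1 do
            for c in 0 .. channels - 1 do
                // Source index: pixels are stored one after another.
                let src = (h * width + w) * channels + c
                // Destination index: planes are stored one after another.
                let dst = c * planeSize + h * width + w
                result.[dst] <- float32 pixels.[src]
    result

// A 2 x 1 image: pixel 0 = (B=1, G=2, R=3), pixel 1 = (B=4, G=5, R=6).
let example = interleavedToPlanar 2 1 [| 1uy; 2uy; 3uy; 4uy; 5uy; 6uy |]
// example = [| 1.0f; 4.0f; 2.0f; 5.0f; 3.0f; 6.0f |]
```

The value for channel c at row h and column w ends up at index c * width * height + h * width + w, which is the same position ImageToFeatures reaches by appending in loop order.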

Call ImageToFeatures with the image in question, feed the result into an instance of IEvaluateModelManagedF, and you're done. I'm assuming that your RGB image is in myImage and that you're doing binary classification with a network input size of 224 x 224.

let LoadModelOnCpu modelPath =
    let model = new IEvaluateModelManagedF()
    let description = sprintf "deviceId=-1\r\nmodelPath=\"%s\"" modelPath
    model.Init description
    model.CreateNetwork description
    model
let model = LoadModelOnCpu("myModelFile")
let featureDict = Dictionary()
featureDict.["features"] <- ImageToFeatures(myImage, 224)
let outputs = model.Evaluate(featureDict, "OutputNodes.z", 2)
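
For binary classification, the list returned by model.Evaluate should hold two values. Assuming these are one score per class and that a higher score means a more likely class (which depends on how your network's output node is defined), picking the prediction could look like the sketch below. The helper argMax and the sample scores are hypothetical:

```fsharp
/// Returns the index of the largest value in a sequence of scores.
let argMax (scores: seq<float32>) =
    scores
    |> Seq.mapi (fun i s -> i, s)
    |> Seq.maxBy snd
    |> fst

// Made-up scores for a two-class network; index 1 has the larger score.
let predictedClass = argMax [ 0.25f; 1.75f ]   // 1
```

Which index corresponds to which label depends on the label mapping used during training.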