How to use RGB Image as input for the C# EvalDll Wrapper?


I trained a network using the provided ImageReader, and now I'm trying to use the CNTK EvalDll in a C# project to evaluate RGB images.

I've seen examples that use the EvalDll, but the input is always an array of floats or doubles, never an image.

How can I use the exposed interface to evaluate an RGB image with the trained network?


Solution

  • I'll assume that you'll want the equivalent of reading with the ImageReader, where your reader config looks something like

    features=[
            width=224
            height=224
            channels=3
            cropType=Center
    ]
    

    You'll need helper functions to crop the image and to resize it to the input size that the network accepts.

    I'll define two extension methods on System.Drawing.Bitmap (the code here is F#), one to crop and one to resize:

    open System.Collections.Generic
    open System.Drawing
    open System.Drawing.Drawing2D
    open System.Drawing.Imaging
    type Bitmap with
        /// Crops the image in the present object, starting at the given (column, row), and retaining
        /// the given number of columns and rows.
        member this.Crop(column, row, numCols, numRows) = 
            let rect = Rectangle(column, row, numCols, numRows)
            this.Clone(rect, this.PixelFormat)
        /// Creates a resized version of the present image. The returned image
        /// will have the given width and height. This may distort the aspect ratio
        /// of the image.
        member this.ResizeImage(width, height, useHighQuality) =
            // Rather than using image.GetThumbnailImage, use direct image resizing.
            // GetThumbnailImage throws odd out-of-memory exceptions on some 
            // images, see also 
            // http://stackoverflow.com/questions/27528057/c-sharp-out-of-memory-exception-in-getthumbnailimage-on-a-server
            // Use the interpolation method suggested on 
            // http://stackoverflow.com/questions/1922040/resize-an-image-c-sharp
            let rect = Rectangle(0, 0, width, height)
            let destImage = new Bitmap(width, height)
            destImage.SetResolution(this.HorizontalResolution, this.VerticalResolution)
            use graphics = Graphics.FromImage destImage
            graphics.CompositingMode <- CompositingMode.SourceCopy
            if useHighQuality then
                graphics.InterpolationMode <- InterpolationMode.HighQualityBicubic
                graphics.CompositingQuality <- CompositingQuality.HighQuality
                graphics.SmoothingMode <- SmoothingMode.HighQuality
                graphics.PixelOffsetMode <- PixelOffsetMode.HighQuality
            else
                graphics.InterpolationMode <- InterpolationMode.Low
            use wrapMode = new ImageAttributes()
            wrapMode.SetWrapMode WrapMode.TileFlipXY
            graphics.DrawImage(this, rect, 0, 0, this.Width, this.Height, GraphicsUnit.Pixel, wrapMode)
            destImage
    

    Based on that, define a function to do the center crop:

    /// Returns a square sub-image from the center of the given image, with
    /// a side length that is cropRatio times the smaller image dimension.
    /// Cropping does not distort the image, so the aspect ratio is preserved.
    let CenterCrop cropRatio (image: Bitmap) =
        let cropSize = 
            float(min image.Height image.Width) * cropRatio
            |> int
        let startRow = (image.Height - cropSize) / 2
        let startCol = (image.Width - cropSize) / 2
        image.Crop(startCol, startRow, cropSize, cropSize)
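
    To make the crop arithmetic concrete, here is a minimal sketch of the same computation on plain integers, without System.Drawing. The name centerCropRect is hypothetical and exists only for this illustration.

    ```fsharp
    /// Hypothetical helper: returns (startCol, startRow, cropSize) for a center crop,
    /// using the same arithmetic as CenterCrop above.
    let centerCropRect (cropRatio: float) (width: int) (height: int) =
        let cropSize = int (float (min height width) * cropRatio)
        let startRow = (height - cropSize) / 2
        let startCol = (width - cropSize) / 2
        (startCol, startRow, cropSize)

    // A 640 x 480 image with cropRatio 1.0: the crop is 480 x 480
    // and starts 80 columns in from the left edge, at row 0.
    let fullCrop = centerCropRect 1.0 640 480   // (80, 0, 480)
    ```

    With a cropRatio of 1.0 the crop spans the full shorter side, so only the longer side is trimmed.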
    

    Then put it all together: crop, resize, then traverse the image plane by plane, in the channel order that OpenCV uses (B, G, R):

    /// Creates a list of CNTK feature values from a given bitmap.
    /// The image is first center-cropped to a square and resized to
    /// (targetSize x targetSize), then the image planes are converted to a CNTK tensor.
    /// Returns a list with targetSize*targetSize*3 values.
    let ImageToFeatures (image: Bitmap, targetSize) =
        // Apply the same image pre-processing that is typically done
        // in CNTK when running it in test or write mode: Take a center
        // crop of the image, then re-size it to the network input size.
        let cropped = CenterCrop 1.0 image
        let resized = cropped.ResizeImage(targetSize, targetSize, false)
        // Ensure that the initial capacity of the list is provided 
        // with the constructor. Creating the list via the default constructor
        // makes the whole operation 20% slower.
        let features = List<float32>(targetSize * targetSize * 3)
        // Traverse the image in the format that is used in OpenCV:
        // first the B plane, then the G plane, then the R plane.
        for c in 0 .. 2 do
            for h in 0 .. (resized.Height - 1) do
                for w in 0 .. (resized.Width - 1) do
                    let pixel = resized.GetPixel(w, h)
                    let v = 
                        match c with 
                        | 0 -> pixel.B
                        | 1 -> pixel.G
                        | 2 -> pixel.R
                        | _ -> failwith "No such channel"
                        |> float32
                    features.Add v
        features
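
    The plane-by-plane traversal above determines where each value ends up in the flat list. As a rough sketch, the position of a single value can be written as a formula; featureIndex is a hypothetical helper, not part of CNTK or of the code above.

    ```fsharp
    /// Hypothetical helper: position in the flat feature list of the value for
    /// channel c (0 = B, 1 = G, 2 = R), row h and column w, given the loop
    /// order channel -> row -> column used by ImageToFeatures.
    let featureIndex (height: int) (width: int) (c: int) (h: int) (w: int) =
        c * height * width + h * width + w

    // For a 224 x 224 input:
    let firstGreen = featureIndex 224 224 1 0 0      // 50176, right after the B plane
    let lastValue  = featureIndex 224 224 2 223 223  // 150527, i.e. 224*224*3 - 1
    ```

    If the network was trained on data in a different channel or memory order, the loops in ImageToFeatures would need to change to match.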
    

    Call ImageToFeatures with the image in question, feed the result into an instance of IEvaluateModelManagedF, and you're done. I'm assuming that your RGB image is in myImage, and that you're doing binary classification with a network input size of 224 x 224.

    // Namespace of the managed EvalDll wrapper.
    open Microsoft.MSR.CNTK.Extensibility.Managed
    let LoadModelOnCpu modelPath =
        let model = new IEvaluateModelManagedF()
        let description = sprintf "deviceId=-1\r\nmodelPath=\"%s\"" modelPath
        model.Init description
        model.CreateNetwork description
        model
    let model = LoadModelOnCpu("myModelFile")
    let featureDict = Dictionary()
    featureDict.["features"] <- ImageToFeatures(myImage, 224)
    model.Evaluate(featureDict, "OutputNodes.z", 2)
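
    The call above discards its result. Assuming that Evaluate returns the two raw values of OutputNodes.z as a list of float32, one way to turn them into a predicted class is to take the index of the largest value. This is only a sketch: argMax is a hypothetical helper, and the sample values stand in for real network output.

    ```fsharp
    /// Hypothetical helper: index of the largest value in the output sequence.
    /// If several values tie, the first one wins.
    let argMax (outputs: seq<float32>) =
        outputs
        |> Seq.mapi (fun i v -> i, v)
        |> Seq.maxBy snd
        |> fst

    // Stand-in for the list returned by model.Evaluate(featureDict, "OutputNodes.z", 2)
    let sampleOutputs = [ 0.2f; 1.7f ]
    let predictedClass = argMax sampleOutputs   // 1
    ```

    Which class each index stands for depends on the label mapping used during training.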