
How to extract, upcast and process an array of bytes from GetPixelSpan then save back to a file?


This is probably a really simple matter, but I can't quite figure out how to put the pieces together. This question and this question as well as this page in the API documentation all somewhat hint at the answer, but I haven't been able to work out what I need from them.

Right now I'm trying to implement a naïve little program that opens an image, gets the pixels out into an array, processes them a bit and then saves the updated pixels back as a new image. In this particular case, I want to take the average of the 3x3 window around each pixel as a simple blur. The specific operation isn't too important. There are definitely more efficient ways, but I'm deliberately writing a naïve version for later comparison against other versions. What I haven't been able to work out is how to make this happen. Right now what I have is:

let accessClampedArrayWithDefault (arr: uint32[][]) width height def x y : uint32[] =
    if x < 0 || x > width-1 || y < 0 || y > height-1 then
        def
    else
        arr.[x + width * y]

let extractPixelParts (p: Rgba32) =
    let R = uint32 p.R
    let G = uint32 p.G
    let B = uint32 p.B
    let A = uint32 p.A
    [|R; G; B; A|]

[<EntryPoint>]
let main argv =
    use img = Image.Load(@"D:\Users\sampleimage.jpg")    
    let mutable out_img = img.Clone()    
    let pxs = img.GetPixelSpan().ToArray() |> Array.map extractPixelParts    
    let mutable (nps: uint32[][]) = Array.zeroCreate pxs.Length    
    let ac = accessClampedArrayWithDefault pxs img.Width img.Height [|0u;0u;0u;0u|]

    for x in 0..img.Width-1 do
        for y in 0..img.Height-1 do
            let p = ac x y
            for z in -1..1 do
                for w in -1..1 do
                    let q = ac (x + z) (y + w)
                    nps.[x + y * img.Width] <- Array.zip p q |> Array.map (fun (a,b) -> a + b)
            nps.[x + y * img.Width] <- Array.map (fun i -> float i / 9.0 |> uint32 ) nps.[x + y * img.Width]

    let rpx = Array.collect (fun a -> Array.map byte a) nps

    let out_img = Image.Load<Rgba32>(img.GetConfiguration(), rpx, Formats.Jpeg.JpegDecoder())

    printfn "out_img's width is %d and height is %d" out_img.Width out_img.Height

but it is failing with an exception on the let out_img = line. If I don't include the JpegDecoder part then I get an error message about a missing decoder, but if I do include it then I get an error message about a missing SOI.

So my question is: how can I pull out the pixels and work with each channel in a type wider than 8 bits (e.g. 32 bits), so that I can perform intermediate operations that cannot be represented in 8 bits per channel, then convert the final result back to bytes and reconstitute it as something that can be saved to disk as an image?

I have quite possibly forgotten to mention something important, so please do feel free to ask for clarifications :) Thanks.


Solution

  • I'm not familiar with F#, but it looks like there are several issues:

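    • The line Image.Load<Rgba32>(img.GetConfiguration(), rpx, Formats.Jpeg.JpegDecoder()) will try to decode a JPEG-encoded in-memory stream (provided as byte[]). Your rpx array holds raw pixel values rather than an encoded JPEG file, so the decoder cannot find the SOI (start of image) marker that every JPEG stream begins with.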

    • Regarding your question:

      so that I can perform intermediate operations that cannot be represented in 8 bits per channel

    Why don't you just work on the Rgba32[] array? There is no need for the extractPixelParts function. Storing all your pixels in a jagged array (uint32[][]) makes the code very slow because of the unnecessary heap allocations.

    EDIT: Sorry, I misunderstood this point. If you need higher precision for intermediate operations, I suggest using Vector4: you can use pixel.ToVector4() and pixel.PackFromVector4(...).
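
To illustrate why a wider intermediate type matters, here is a minimal F# sketch that uses only System.Numerics. The value names are made up for the example, and the division by 255 only imitates the 0..1 scaling that ToVector4() applies to 8-bit channels; it is not a call into ImageSharp.

```fsharp
open System.Numerics

// Nine bright samples of a single channel, e.g. the R values of a 3x3 window.
let samples : byte[] = Array.create 9 200uy

// Summing in 8 bits silently wraps around: 9 * 200 = 1800, and 1800 mod 256 = 8.
let wrappedSum : byte = Array.fold (+) 0uy samples

// Imitate ToVector4(): each channel becomes a float32 in the range 0..1.
let asVectors : Vector4[] =
    samples |> Array.map (fun s -> Vector4(float32 s / 255.0f))

// Summing Vector4 values cannot wrap, so the average stays meaningful.
let average : Vector4 =
    Vector4.Divide(Array.fold (fun (acc: Vector4) v -> acc + v) Vector4.Zero asVectors, 9.0f)

// Scale back to 0..255 and round before storing the result in a byte again.
let averagedByte : byte = byte (round (average.X * 255.0f))
```

The byte sum ends up as 8 while the Vector4 route recovers 200, which is the kind of intermediate precision the question asks for.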

    My suggestion (still not optimized but probably easy to understand):

    1. Do not copy the image. Just create an Rgba32[] (!!!) array by let pxs = img.GetPixelSpan().ToArray()
    2. Process the values inside the array using the formula arr[y * Width + x] = CreateMyNewRgbaPixelValueAtXY(....) where CreateMyNewRgbaPixelValueAtXY(...) should return an Rgba32
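    3. Return a new image by Image.LoadPixelData(pxs, img.Width, img.Height). The LoadPixelData method will create a new image by copying your pxs: Rgba32[] data into it; the flat array carries no dimensions, so the width and height have to be passed along.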
    4. Dispose your original image!
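
Step 2 could be sketched in F# roughly as follows. To keep the example runnable without the library, Px is a hypothetical stand-in for Rgba32, and blurAt and boxBlur3x3 are illustrative names, not ImageSharp functions. The sums are held in int so that nine 8-bit values cannot overflow, and neighbours outside the image count as zero, as in the question's code.

```fsharp
// Hypothetical stand-in for ImageSharp's Rgba32, so the sketch runs without the library.
[<Struct>]
type Px = { R: byte; G: byte; B: byte; A: byte }

/// Average of the 3x3 window centred on (x, y).
/// Neighbours outside the image contribute zero to the sum.
let blurAt (width: int) (height: int) (src: Px[]) (x: int) (y: int) : Px =
    // Accumulate in int: nine bytes can add up to 2295, far beyond a byte.
    let mutable r = 0
    let mutable g = 0
    let mutable b = 0
    let mutable a = 0
    for dy in -1 .. 1 do
        for dx in -1 .. 1 do
            let nx = x + dx
            let ny = y + dy
            if nx >= 0 && nx < width && ny >= 0 && ny < height then
                let p = src.[ny * width + nx]   // row-major: index = y * width + x
                r <- r + int p.R
                g <- g + int p.G
                b <- b + int p.B
                a <- a + int p.A
    { R = byte (r / 9); G = byte (g / 9); B = byte (b / 9); A = byte (a / 9) }

/// Builds a new row-major array with one blurred pixel per coordinate.
let boxBlur3x3 (width: int) (height: int) (src: Px[]) : Px[] =
    Array.init (width * height) (fun i -> blurAt width height src (i % width) (i / width))
```

With the real pixel type, the input would be the array from step 1 and the result would be handed to Image.LoadPixelData together with the image's width and height (assuming the overload that takes the dimensions is available in the version in use).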

    EDIT 2

    In order to perform intermediate operation in an efficient way, I suggest the following:

    • Create an inputPixelData: Vector4[] intermediate array, filled by invoking pixel.ToVector4() on each input pixel
    • Create another array, outputPixelData: Vector4[], and fill it by processing inputPixelData
    • Pack outputPixelData back into a pixels: Rgba32[] array using .PackFromVector4(outputPixelData[y * Width + x]) (I don't know the best way to do this in F#)
    • Create the output image with Image.LoadPixelData(pixels, Width, Height)

    There is probably a better way, but I'm unfamiliar with F#.
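
The Vector4 pipeline from EDIT 2 might look like this in F#. This is only a sketch: boxBlurVec and windowSum are hypothetical helper names, the processing part depends on nothing but System.Numerics, and the ImageSharp calls appear only in comments because their exact names and overloads depend on the library version.

```fsharp
open System.Numerics

/// Sum of the 3x3 window centred on (x, y) in a row-major Vector4 buffer.
/// Neighbours outside the image contribute zero.
let windowSum (width: int) (height: int) (input: Vector4[]) (x: int) (y: int) : Vector4 =
    let mutable sum = Vector4.Zero
    for dy in -1 .. 1 do
        for dx in -1 .. 1 do
            let nx = x + dx
            let ny = y + dy
            if nx >= 0 && nx < width && ny >= 0 && ny < height then
                sum <- sum + input.[ny * width + nx]
    sum

/// Fills outputPixelData from inputPixelData with a 3x3 box blur.
let boxBlurVec (width: int) (height: int) (input: Vector4[]) : Vector4[] =
    Array.init (width * height) (fun i ->
        Vector4.Divide(windowSum width height input (i % width) (i / width), 9.0f))

// Assumed glue around the function above, following the calls named in this answer
// (not verified against a specific ImageSharp version):
//   let input  = img.GetPixelSpan().ToArray() |> Array.map (fun p -> p.ToVector4())
//   let output = boxBlurVec img.Width img.Height input
//   let pixels = output |> Array.map (fun v -> Rgba32(v))
//   use result = Image.LoadPixelData(pixels, img.Width, img.Height)
```

Keeping the blur itself free of image types makes it easy to compare against other implementations later, since only the conversion at the two ends touches the library.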