How to detect shape on a transparent canvas?


I'm looking for a method of detecting a shape in a transparent PNG. For example, I will create a transparent canvas of 940x680, then place a fully opaque object somewhere in that canvas.

I want to be able to detect the size (w, h), and top + left location of that object.

Here is an example of the original image:

Transparent PNG Canvas with Image Object

Here is an example of what I would like to achieve (Bounding box overlay, with top + left margin data):

Results Image

I've found a resource that does some transparency detection, but I'm not sure how I scale something like this to what I'm looking for.

var imgData,
    width = 200,
    height = 200;

$('#mask').bind('mousemove', function(ev){
    if(!imgData){ initCanvas(); }
    var imgPos = $(this).offset(),
      mousePos = {x : ev.pageX - imgPos.left, y : ev.pageY - imgPos.top},
      pixelPos = 4*(mousePos.x + width*mousePos.y),
         alpha = imgData.data[pixelPos+3];

    $('#opacity').text('Opacity = ' + ((100*alpha/255) << 0) + '%');
});

function initCanvas(){
    var canvas = $('<canvas width="'+width+'" height="'+height+'" />')[0],
           ctx = canvas.getContext('2d');

    ctx.drawImage($('#mask')[0], 0, 0);
    imgData = ctx.getImageData(0, 0, width, height);
}

Fiddle


Solution

  • What you need to do:

    • Get the buffer
    • Get a 32-bit view of that buffer (if all the other pixels are fully transparent, you can iterate over a Uint32Array, where each element holds one whole pixel).
    • Scan 0 - width to find the x1 (left) edge
    • Scan width - 0 to find the x2 (right) edge
    • Scan 0 - height to find the y1 (top) edge
    • Scan height - 0 to find the y2 (bottom) edge

    These scans can be combined but for simplicity I'll show each step separately.

    Online demo of this can be found here.

    Result:

    Snapshot

    When the image has loaded, draw it to the canvas. (If the image itself is small, the rest of this example is unnecessary, since you already know its coordinates when you draw it. The assumption here is that the image you draw is large, with a smaller shape somewhere inside it.)

    (note: this is a non-optimized version for the sake of simplicity)

    ctx.drawImage(this, 0, 0, w, h);
    
    var idata = ctx.getImageData(0, 0, w, h),      // get image data for canvas
        buffer = idata.data,                       // get buffer (unnecessary step)
        buffer32 = new Uint32Array(buffer.buffer), // get a 32-bit representation
        x, y,                                      // iterators
        x1 = w, y1 = h, x2 = 0, y2 = 0;            // min/max values
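
To illustrate what the 32-bit view gives you, here is a minimal sketch that runs without a canvas. The two-pixel array and the names `rgba` and `pixels32` are made up for illustration; in the real code the bytes come from `idata.data`. Note the assumption: comparing the whole 32-bit value against 0 only works when transparent pixels are all-zero (which is what a cleared canvas gives you). A pixel with alpha 0 but non-zero color bytes would also read as non-zero.

```javascript
// Two RGBA pixels: one fully transparent, one opaque red (made-up sample data)
const rgba = new Uint8ClampedArray([
    0,   0, 0, 0,      // pixel 0: transparent
    255, 0, 0, 255     // pixel 1: opaque red
]);

// A Uint32Array over the same ArrayBuffer sees one element per pixel
const pixels32 = new Uint32Array(rgba.buffer);

console.log(pixels32.length);      // 2 (8 bytes / 4 bytes per pixel)
console.log(pixels32[0] === 0);    // true: every byte of pixel 0 is zero
console.log(pixels32[1] > 0);      // true: at least one byte is non-zero
```

Both arrays share the same memory, so no pixel data is copied.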
    

    Then scan for each edge. For the left edge, scan from 0 to width on each line (non-optimized):

    // get left edge
    for(y = 0; y < h; y++) {                       // line by line
        for(x = 0; x < w; x++) {                   // 0 to width
            if (buffer32[x + y * w] > 0) {         // non-transparent pixel?
                if (x < x1) x1 = x;                // if less than current min update
            }
        }
    }
    

    For the right edge you just reverse x iterator:

    // get right edge
    for(y = 0; y < h; y++) {                       // line by line
        for(x = w - 1; x >= 0; x--) {              // from width - 1 to 0
            if (buffer32[x + y * w] > 0) {
                if (x > x2) x2 = x;
            }
        }
    }
    

    The same applies to the top and bottom edges, except that the loops are swapped (columns in the outer loop, rows in the inner loop):

    // get top edge
    for(x = 0; x < w; x++) {
        for(y = 0; y < h; y++) {
            if (buffer32[x + y * w] > 0) {
                if (y < y1) y1 = y;
            }
        }
    }
    
    // get bottom edge
    for(x = 0; x < w; x++) {
        for(y = h - 1; y >= 0; y--) {
            if (buffer32[x + y * w] > 0) {
                if (y > y2) y2 = y;
            }
        }
    }
    

    The resulting region is then:

    ctx.strokeRect(x1, y1, x2-x1, y2-y1);
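
As mentioned, the four scans can be combined. Below is a rough sketch of a single pass that tracks all four extremes at once. The function name `findBoundingBox`, the `alphaThreshold` parameter and the returned object shape are my own choices for illustration, not part of the code above. It reads the alpha byte from the 8-bit buffer instead of the 32-bit view, so it does not depend on transparent pixels being all-zero. The reported width and height count pixels inclusively (`x2 - x1 + 1`).

```javascript
// Hypothetical helper: bounding box of all pixels whose alpha is above
// alphaThreshold, or null if there is no such pixel.
// data is an RGBA byte array laid out like ImageData.data; w and h are in pixels.
function findBoundingBox(data, w, h, alphaThreshold = 0) {
    let x1 = w, y1 = h, x2 = -1, y2 = -1;          // min/max values
    for (let y = 0; y < h; y++) {
        for (let x = 0; x < w; x++) {
            const alpha = data[(x + y * w) * 4 + 3]; // 4th byte of each pixel
            if (alpha > alphaThreshold) {
                if (x < x1) x1 = x;
                if (x > x2) x2 = x;
                if (y < y1) y1 = y;
                if (y > y2) y2 = y;
            }
        }
    }
    if (x2 < 0) return null;                       // nothing but transparency
    return { left: x1, top: y1, width: x2 - x1 + 1, height: y2 - y1 + 1 };
}

// Synthetic 8x6 image with an opaque 3x2 block whose top-left corner is (2, 1)
const demoW = 8, demoH = 6;
const demoData = new Uint8ClampedArray(demoW * demoH * 4);
for (let y = 1; y <= 2; y++) {
    for (let x = 2; x <= 4; x++) {
        demoData[(x + y * demoW) * 4 + 3] = 255;   // set alpha only
    }
}
console.log(findBoundingBox(demoData, demoW, demoH));
// { left: 2, top: 1, width: 3, height: 2 }
```

With a real canvas you would pass `idata.data`, `w` and `h` from the snippet above.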
    

    There are various optimizations you could implement, but they depend entirely on the scenario. For example, if you know the approximate placement, you don't have to iterate over all lines/columns.

    You could do a brute-force guess of the placement by skipping x pixels at a time, and once you find a non-transparent pixel, use it to derive a maximum search area, and so forth, but that is out of scope here.
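
One optimization that does not need any knowledge about placement is to stop each scan as soon as it hits the shape: walk rows inward from the top and bottom, then walk columns inward from the left and right, looking only at the rows already known to contain the shape. This is just a sketch of that idea, not the pixel-skipping guess described above. The name `findBoundingBoxEarlyExit` and the return shape are assumptions for illustration.

```javascript
// Hypothetical helper: same result as a full scan, but each edge search
// stops at the first row/column that contains a non-transparent pixel.
// data is an RGBA byte array laid out like ImageData.data.
function findBoundingBoxEarlyExit(data, w, h) {
    const alphaAt = (x, y) => data[(x + y * w) * 4 + 3];

    // true if row y has a non-transparent pixel
    const rowHasPixel = (y) => {
        for (let x = 0; x < w; x++) if (alphaAt(x, y) > 0) return true;
        return false;
    };
    // true if column x has a non-transparent pixel between rows ya..yb
    const colHasPixel = (x, ya, yb) => {
        for (let y = ya; y <= yb; y++) if (alphaAt(x, y) > 0) return true;
        return false;
    };

    let top = 0;
    while (top < h && !rowHasPixel(top)) top++;
    if (top === h) return null;                    // fully transparent image

    let bottom = h - 1;
    while (bottom > top && !rowHasPixel(bottom)) bottom--;

    // Columns only need checking inside the rows found above
    let left = 0;
    while (!colHasPixel(left, top, bottom)) left++;
    let right = w - 1;
    while (right > left && !colHasPixel(right, top, bottom)) right--;

    return { left, top, width: right - left + 1, height: bottom - top + 1 };
}

// Synthetic 10x10 image with opaque pixels at (3, 4) and (6, 7)
const testW = 10, testH = 10;
const testData = new Uint8ClampedArray(testW * testH * 4);
testData[(3 + 4 * testW) * 4 + 3] = 255;
testData[(6 + 7 * testW) * 4 + 3] = 255;
console.log(findBoundingBoxEarlyExit(testData, testW, testH));
// { left: 3, top: 4, width: 4, height: 4 }
```

For a small shape in a large canvas the worst case is still a scan of most of the image, so treat the gain as scenario-dependent.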

    Hope this helps!