
Object Localization - Google Vision API


I'm trying to re-create this (linked image).

The issue is that the data returned from the API call to objectLocalization only contains normalized vertices (normalizedVertices); the vertices array is empty.

My code so far is exactly the same as the example:

const vision = require('@google-cloud/vision');

const client = new vision.ImageAnnotatorClient();

const imageURL = './Laptop_2D00_and_2D00_Tablet_2D00_Table_5F00_69BC66ED.jpg';

client
    .objectLocalization(imageURL)
    .then(results => {
        const objects = results[0].localizedObjectAnnotations;
        objects.forEach(object => {
            console.log(`Name: ${object.name}`);
            console.log(`Confidence: ${object.score}`);
            const vertices = object.boundingPoly.normalizedVertices;
            vertices.forEach(v => console.log(`x: ${v.x}, y:${v.y}`));
        });
    })
    .catch(err => {
        console.error('ERROR: ', err);
    });

And the results are:

Name: Laptop
Confidence: 0.8650877475738525
x: 0.004973180592060089, y:0.27008256316185
x: 0.18256860971450806, y:0.27008256316185
x: 0.18256860971450806, y:0.5250381827354431
x: 0.004973180592060089, y:0.5250381827354431
Name: Computer keyboard
Confidence: 0.732001006603241
x: 0.20447060465812683, y:0.6251764893531799
x: 0.5232940912246704, y:0.6251764893531799
x: 0.5232940912246704, y:0.9054117202758789
x: 0.20447060465812683, y:0.9054117202758789
Name: Person
Confidence: 0.6957111954689026
x: 0.9150910377502441, y:0.03288845717906952
x: 0.9932186007499695, y:0.03288845717906952
x: 0.9932186007499695, y:0.31247377395629883
x: 0.9150910377502441, y:0.31247377395629883
Name: Laptop
Confidence: 0.6388971209526062
x: 0.20340178906917572, y:0.3301794230937958
x: 0.4965982437133789, y:0.3301794230937958
x: 0.4965982437133789, y:0.9114677906036377
x: 0.20340178906917572, y:0.9114677906036377
Name: Table
Confidence: 0.5609536170959473
x: 0, y:0.11000002175569534
x: 0.998235285282135, y:0.11000002175569534
x: 0.998235285282135, y:0.9940000176429749
x: 0, y:0.9940000176429749
Name: Computer keyboard
Confidence: 0.5245768427848816
x: 0.012653245590627193, y:0.4093095660209656
x: 0.16077089309692383, y:0.4093095660209656
x: 0.16077089309692383, y:0.5089566707611084
x: 0.012653245590627193, y:0.5089566707611084

I want to then map those objects onto the image, but I have no idea how to do that with the data provided.


Solution

  • I've solved this.

    If anyone is interested, you can just multiply each normalized vertex's x by the width of the image and its y by the height of the image to get pixel coordinates.

    Example of drawing the bounding boxes back onto the image using an HTML5 canvas:

        img.onload = () => {
          // Display size of the canvas; normalized coordinates are scaled to it
          canvas.width = 512;
          canvas.height = 340.5;

          ctx.drawImage(img, 0, 0, 512, 340.5);
          ctx.strokeStyle = '#ff0000';
          for (const annotation of this.state.imageData) {
            const vertices = annotation.boundingPoly.normalizedVertices;
            ctx.beginPath();
            // Scale x by the canvas width and y by the canvas height
            ctx.moveTo(vertices[0].x * canvas.width, vertices[0].y * canvas.height);
            for (let j = 1; j < vertices.length; j++) {
              ctx.lineTo(vertices[j].x * canvas.width, vertices[j].y * canvas.height);
            }
            // Draw the final edge back to the first vertex
            ctx.closePath();
            ctx.stroke();
          }
        };
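
For anyone who wants the scaling step on its own, outside the canvas code, here is a minimal sketch. The helper names `toPixelVertices` and `toRect` are hypothetical and not part of `@google-cloud/vision`. The sketch assumes the vertices are plain objects with `x` and `y` in the 0..1 range, and it treats a missing coordinate as 0, which is an assumption rather than documented behavior.

```javascript
// Hypothetical helper (not part of @google-cloud/vision): scale normalized
// vertices (0..1) to pixel coordinates for an image of the given size.
function toPixelVertices(normalizedVertices, width, height) {
  return normalizedVertices.map(v => ({
    // Assumption: a missing coordinate is treated as 0
    x: (v.x || 0) * width,
    y: (v.y || 0) * height,
  }));
}

// Hypothetical helper: axis-aligned rectangle enclosing the pixel vertices,
// in a shape that is convenient for ctx.strokeRect(left, top, width, height).
function toRect(pixelVertices) {
  const xs = pixelVertices.map(v => v.x);
  const ys = pixelVertices.map(v => v.y);
  const left = Math.min(...xs);
  const top = Math.min(...ys);
  return {
    left,
    top,
    width: Math.max(...xs) - left,
    height: Math.max(...ys) - top,
  };
}

// Made-up box covering the lower middle of a 512x340 image
const corners = toPixelVertices(
  [{ x: 0.25, y: 0.5 }, { x: 0.75, y: 0.5 }, { x: 0.75, y: 1 }, { x: 0.25, y: 1 }],
  512,
  340
);
console.log(corners[0]);      // { x: 128, y: 170 }
console.log(toRect(corners)); // { left: 128, top: 170, width: 256, height: 170 }
```

The width and height passed in should be the size you are drawing at (the canvas size in the example above), not necessarily the original image size, because the normalized values are relative to whatever dimensions you scale them to.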