machine-learning · augmented-reality · arkit · coreml · apple-vision

Object detection ARKit vs CoreML


I am building an ARKit application for iPhone. I need to detect a specific perfume bottle and display content depending on what is detected. I used the demo app from developer.apple.com to scan the real-world object and export an .arobject file that I can use in my assets. It works, but since the bottle is made of glass, detection is very poor: the object is detected only in the location where the scan was made, taking anywhere from 2 to 30 seconds, or it isn't detected at all. Merging scans doesn't improve the situation and sometimes makes it even worse, since the merged result may have a weird orientation.
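For context, my detection setup looks roughly like this (a minimal sketch, assuming the .arobject files live in an "AR Resources" asset catalog group and the app uses an ARSCNView):

```swift
import ARKit

class ViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)

        // Load the scanned .arobject reference objects from the asset catalog
        guard let referenceObjects = ARReferenceObject.referenceObjects(
            inGroupNamed: "AR Resources", bundle: nil
        ) else {
            fatalError("Missing reference objects in asset catalog")
        }

        let configuration = ARWorldTrackingConfiguration()
        configuration.detectionObjects = referenceObjects
        sceneView.session.run(configuration)
    }

    // Called when ARKit recognizes one of the scanned objects
    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        guard let objectAnchor = anchor as? ARObjectAnchor else { return }
        print("Detected: \(objectAnchor.referenceObject.name ?? "unnamed object")")
        // Attach the bottle-specific content to `node` here
    }
}
```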

What can I do to solve this?

If nothing, will CoreML help me? I can take a lot of photos and train a model. What if I check each camera frame for a match against this model? Does such an approach have any chance?
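The per-frame idea I have in mind would go roughly like this (a sketch only; `PerfumeClassifier` is a hypothetical Create ML image classifier trained on the bottle photos, not something that exists yet):

```swift
import ARKit
import Vision

final class BottleClassifier {
    private let request: VNCoreMLRequest

    init() throws {
        // Wrap the (hypothetical) Core ML image classifier for use with Vision
        let model = try VNCoreMLModel(for: PerfumeClassifier().model)
        request = VNCoreMLRequest(model: model)
        request.imageCropAndScaleOption = .centerCrop
    }

    /// Classifies the current camera frame; returns the top label and its confidence.
    func classify(frame: ARFrame) throws -> (label: String, confidence: Float)? {
        let handler = VNImageRequestHandler(
            cvPixelBuffer: frame.capturedImage,
            orientation: .right,   // portrait device orientation; adjust for your UI
            options: [:]
        )
        try handler.perform([request])
        guard let best = (request.results as? [VNClassificationObservation])?.first else {
            return nil
        }
        return (best.identifier, best.confidence)
    }
}
```

I would call `classify(frame:)` from the session delegate's `session(_:didUpdate:)`, throttled to every few frames, and only show content when the confidence is high enough.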


Solution

  • About glass refraction

    Due to the refraction of glass and varying lighting conditions, object recognition (in both ARKit and CoreML) for perfume bottles is one of the hardest cases.

    Look at the following picture – there are three glass balls at different locations:

    [Image: three glass balls rendered in different environments]

    These glass balls have different Fresnel IOR (Index of Refraction) values, environments, camera points of view, sizes and lighting conditions, but they have the same shape, material and colour.

    So, the best way to speed up the recognition process is to use an identical background/environment (for example, a monochromatic light-grey paper background), the same lighting conditions (location, intensity, colour and direction of the light), good shape readability (thanks to specular highlights) and the same POV for your camera.


    I know it's sometimes impossible to follow these tips, but they do work.