
Are there any limitations in Vuforia compared to ARCore and ARKit?


I am a beginner in the field of augmented reality, working on applications that create plans of buildings (floor plans, room plans, etc., with accurate measurements) using a smartphone. So I am researching the best AR SDK for this. There are not many articles pitting Vuforia against ARCore and ARKit.

Please suggest the best SDK to use, and the pros and cons of each.


Solution

  • Updated: February 02, 2024.

    TL;DR


    Foreword

    Before answering your question, I would like to point out that any AR framework benefits greatly from having many scene understanding features and many types of anchors. An abundance of anchor types lets you not only securely tether 3D models according to a certain scenario, but even use a real human being or a human hand as a starting point when measuring a distance (I mean iOS ARBodyAnchor and visionOS HandAnchor). Another indisputable advantage of any framework is the availability of high-quality 32-bit depth data for Scene Reconstruction and occlusion. In fact, almost any new feature of an AR framework contributes to the quality of the AR experience. The same can be said about the hardware: Apple's M-, R- and U-series chips enhance the quality of the AR experience. Apple recently presented the Vision Pro headset (or, in other words, a "successor" of the discontinued MS HoloLens headset), which allows you to interact with the AR scene using hand gestures, eye gaze, and voice commands.

    We live in the era of spatial computing.



    What is what

    Google ARCore allows you to build apps for Android and iOS. With Apple ARKit and RealityKit you can build apps for visionOS and iOS. PTC Vuforia was designed to create apps for Android, iOS and Universal Windows Platform.

    A crucial peculiarity of Vuforia is that it uses ARCore/ARKit technologies (also known as platform hardware and software enablers) if the hardware it's running on supports them. Otherwise, Vuforia uses its own AR technology and engine, a software solution with no hardware dependency. In practice, an AR experience created with Vuforia Engine will attempt to use the topmost technology and work its way downward depending on what's available on the device at runtime.
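
    The "topmost technology first" behaviour described above can be pictured as a simple priority list. The following Python sketch is only an illustration: `DeviceCapabilities`, `pick_tracking_provider` and the provider labels are hypothetical names invented here, not part of the Vuforia API, and the real engine performs this detection internally.

```python
from dataclasses import dataclass


@dataclass
class DeviceCapabilities:
    """Hypothetical capability flags for a device (assumption for this sketch)."""
    supports_arkit: bool = False
    supports_arcore: bool = False


def pick_tracking_provider(caps: DeviceCapabilities) -> str:
    """Return the first available technology, working from the top downward."""
    # Ordered from the platform enablers down to the software-only fallback.
    candidates = [
        ("ARKit", caps.supports_arkit),
        ("ARCore", caps.supports_arcore),
        ("Vuforia software tracking", True),  # assumed to be always available
    ]
    for name, available in candidates:
        if available:
            return name
    raise RuntimeError("unreachable: the software fallback is always available")


if __name__ == "__main__":
    print(pick_tracking_provider(DeviceCapabilities(supports_arcore=True)))
```

    The point of the sketch is only that the same app can end up on different tracking back ends, which is why measurement quality varies between devices.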

    When developing for Android OEM smartphones, you may encounter an unpleasant issue: devices from different manufacturers need sensor calibration in order to deliver the same AR experience. Luckily, Apple devices have no such drawback because all their sensors were calibrated under identical conditions.

    Let me put first things first.




    Google ARCore

    ARCore was released in March 2018. ARCore is based on three fundamental concepts: Motion Tracking, Environmental Understanding and Light Estimation. ARCore allows a supported mobile device to track its position and orientation relative to the world in 6 degrees of freedom (6DoF) using a special technique called Concurrent Odometry and Mapping (COM). COM helps detect the size and location of horizontal, vertical and angled tracked surfaces. Motion Tracking works robustly thanks to optical data coming from an RGB camera at 60 fps, combined with inertial data coming from the gyroscope and accelerometer at 1000 fps, and depth data coming from a ToF sensor at 60 fps. ARKit, Vuforia and other AR libraries operate in almost the same way.

    

When you move your phone through the real environment, ARCore tracks the surrounding space to understand where the smartphone is relative to the world coordinates. At the tracking stage, ARCore "sows" so-called feature points. These feature points are visible through the RGB camera, and ARCore uses them to compute the phone's change in location. The visual data must then be combined with measurements from the IMU (Inertial Measurement Unit) to estimate the position and orientation of the ArCamera over time. If a phone isn't equipped with a ToF sensor, ARCore looks for clusters of feature points that appear to lie on horizontal, vertical or angled surfaces and makes these surfaces available to your app as planes (we call this technique Plane Detection). After the detection process you can use these planes to place 3D objects in your scene. Virtual geometry with assigned shaders will be rendered by ARCore's companion, Sceneform, which supports a real-time Physically Based Rendering (a.k.a. PBR) engine, Filament.
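
To make the idea of "clusters of feature points that appear to lie on a surface" more concrete, here is a deliberately simplified Python sketch. It is not ARCore's algorithm: it only handles horizontal planes, it assumes points are (x, y, z) tuples in metres with y pointing up, and the function name, bin size and minimum point count are arbitrary choices made for illustration.

```python
from collections import defaultdict


def detect_horizontal_plane(points, bin_size=0.05, min_points=4):
    """Toy horizontal plane detection over (x, y, z) feature points.

    Groups points into height bins of `bin_size` metres and returns the mean
    height and x/z extent of the most populated bin, or None when no bin
    holds at least `min_points` points.
    """
    bins = defaultdict(list)
    for x, y, z in points:
        bins[round(y / bin_size)].append((x, y, z))

    best = max(bins.values(), key=len, default=[])
    if len(best) < min_points:
        return None

    xs = [p[0] for p in best]
    zs = [p[2] for p in best]
    return {
        "height": sum(p[1] for p in best) / len(best),
        "extent_x": max(xs) - min(xs),
        "extent_z": max(zs) - min(zs),
        "count": len(best),
    }


if __name__ == "__main__":
    floor = [(0, 0.00, 0), (1, 0.01, 0), (0, -0.01, 1), (1, 0.0, 1), (0.5, 0.0, 0.5)]
    table = [(0.2, 0.75, 0.3)]
    print(detect_horizontal_plane(floor + table))
```

A texture-free wall yields too few feature points for this kind of clustering, which is exactly the case where a depth sensor helps.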

    Notwithstanding the above, the Sceneform repository has been archived and is no longer actively maintained by Google. The last official release was Sceneform 1.17.1. That may sound strange, but an ARCore team member said "there's no direct replacement for Sceneform library and ARCore developers are free to use any 3D game library with Android AR apps". However, there's an unofficial Sceneform + SceneView fork that continues the archived Sceneform framework (its last release is Sceneform 1.23).

    
ARCore's environmental understanding lets you place 3D objects with correct depth occlusion, so that they integrate realistically with the real world. For example, you can place a virtual cup of coffee on the table using Depth hit-testing and ArAnchors.

    
ARCore can also define lighting parameters of a real environment and provide you with the average intensity and color correction of a given camera image. This data lets you light your virtual scene under the same conditions as the environment around you, considerably increasing the sense of realism.
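
As a rough illustration of what "average intensity and color correction" can mean, the Python sketch below computes both from a list of pixels. It is not ARCore's implementation: the function name is hypothetical, pixels are assumed to be (r, g, b) tuples in the range 0.0 to 1.0, and normalising the scale factors by the green channel is an assumption made for this sketch.

```python
def estimate_light(pixels):
    """Return (average_intensity, (r_scale, g_scale, b_scale)) for RGB pixels.

    The scale factors are each channel's mean divided by the green mean,
    so a neutral grey image gives (1.0, 1.0, 1.0).
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("need at least one pixel")

    n = len(pixels)
    mean_r = sum(p[0] for p in pixels) / n
    mean_g = sum(p[1] for p in pixels) / n
    mean_b = sum(p[2] for p in pixels) / n

    intensity = (mean_r + mean_g + mean_b) / 3.0
    if mean_g == 0:
        # No green signal to normalise by: fall back to a neutral correction.
        return intensity, (1.0, 1.0, 1.0)
    return intensity, (mean_r / mean_g, 1.0, mean_b / mean_g)


if __name__ == "__main__":
    # A warm (orange-ish) image boosts red and damps blue.
    print(estimate_light([(0.8, 0.4, 0.2)]))
```

Multiplying a virtual object's colour by such scale factors and by the intensity is one simple way to make it match the camera image.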



    The current ARCore version has such significant APIs as the Raw Depth API and Full Depth API, Geospatial API, Lighting Estimation, Scene Semantics API, Vulkan rendering (in addition to OpenGL), Terrain Anchor API, Electronic Image Stabilization, Augmented Faces, Augmented Images, Instant Placement, 365-day Cloud Anchors, Recording and Playback, and Multiplayer support. A valuable addition to the ARCore toolkit is the Android Emulator, which lets you run and debug AR apps on a virtual device.



    In ARCore 1.31, the Google engineers mapped each shade of gray in the 16-bit depth channel to a distance of 1 mm. Thus, they managed to cover a distance of 65,536 millimeters (2^16). This table presents the difference between Raw Depth API and Full Depth API:

             | Full Depth API (v1.31+) | Raw Depth API (v1.24+) | Full Depth API (v1.18+)
    Accuracy | Bad                     | Good                   | Bad
    Coverage | All pixels              | Not all pixels         | All pixels
    Distance | 0 to 65.5 m             | 0.5 to 5.0 m           | 0 to 8.2 m
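
    The distance figures follow directly from the bit depth. The Python sketch below just redoes that arithmetic; the function names are hypothetical, and treating the 8.2 m range as 13 bits at 1 mm per step is my reading of the table, not something stated in the text.

```python
def depth_range_m(bits, mm_per_step=1):
    """Distance in metres covered by `bits` of depth at `mm_per_step` mm per value."""
    return (2 ** bits) * mm_per_step / 1000.0


def raw_depth_to_m(value, bits=16):
    """Convert a raw depth value (1 mm per step) to metres."""
    if not 0 <= value < 2 ** bits:
        raise ValueError(f"value does not fit in {bits} bits")
    return value / 1000.0


if __name__ == "__main__":
    print(depth_range_m(16))      # 16-bit channel, 1 mm per step
    print(depth_range_m(13))      # assumed 13-bit channel, 1 mm per step
    print(raw_depth_to_m(1500))   # a pixel 1.5 m away
```

    Strictly speaking the largest 16-bit value is 65,535, so the farthest representable distance is one millimetre short of 65.536 m.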


    In terms of lineage, ARCore is older than ARKit. Do you remember Project Tango, released in 2014? Roughly speaking, ARCore is just a rewritten Tango SDK. But the wise acquisitions of FlyBy Media, Faceshift, MetaIO, Camerai and Vrvana helped Apple not only to catch up but to significantly overtake Google. I suppose competition is always good for the AR industry.

    The latest version of ARCore supports OpenGL ES acceleration, and integrates with Unity, Unreal, and Web applications. At the moment the most powerful and energy efficient chipsets for AR experience on Android platform are MediaTek Dimensity 9300 (4 nm), Snapdragon 8 Gen 3 (4 nm), Exynos 2400 (4 nm) and Google Tensor G3 (4 nm).

    Platform-specific directions: Android (Kotlin/Java), Android NDK (C) and Unity (AR Foundation).

    ARCore price: FREE.

    ARCore pros                        | ARCore cons
    iToF and Depth API support         | AR Headset is still in development
    Quick Plane Detection              | Cloud Anchors hosted online
    Long-distance-accuracy             | Lack of native rendering engines
    ARCore Emulator in Android Studio  | No external camera support
    Geospatial anchoring               | Quickly drains phone's battery





    Apple ARKit

    ARKit was released in June 2017. Like its competitors, ARKit also uses a special technique for tracking, called Visual Inertial Odometry (VIO). VIO is used to very accurately track the world around your device and is quite similar to COM found in ARCore. There are also similar fundamental concepts in ARKit: World Tracking, Scene Understanding (which includes four stages: Plane Detection, Ray-Casting, Light Estimation, Scene Reconstruction), and Rendering with great help from ARKit's companions: the SceneKit framework, which has been Apple's 3D game engine since 2012; the RealityKit framework, made specially for AR and written in Swift from scratch (released in 2019); and the SpriteKit framework with its 2D engine (since 2013).

    VIO fuses RGB sensor data at 60 fps with Core-Motion data (IMU) at 1000 fps and LiDAR data. Note that due to a very high energy impact (because of an enormous burden on the CPU and GPU), your iPhone's battery will drain pretty quickly. The same can be said about Android devices.
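
    To give a feel for why a fast IMU and a slower camera complement each other, here is a toy one-dimensional Python sketch. It is not how ARKit or ARCore implement VIO (real systems use far more elaborate estimators); the function names, the simple integration and the `visual_weight` blend factor are all assumptions made for illustration.

```python
def imu_samples_per_frame(imu_hz=1000, camera_fps=60):
    """How many IMU samples arrive between two consecutive camera frames."""
    return imu_hz / camera_fps


def fuse_1d(position, velocity, accel_samples, dt,
            visual_position=None, visual_weight=0.8):
    """Integrate accelerometer samples, then blend in a visual position fix.

    Between camera frames the position is dead-reckoned from the IMU; when a
    camera-based estimate is available it pulls the result back, limiting drift.
    """
    for a in accel_samples:
        velocity += a * dt
        position += velocity * dt
    if visual_position is not None:
        position = visual_weight * visual_position + (1.0 - visual_weight) * position
    return position, velocity


if __name__ == "__main__":
    print(imu_samples_per_frame())  # roughly 16.7 IMU samples per camera frame
    # Ten IMU samples at 1 kHz while moving at 1 m/s, then a visual fix at 0.02 m.
    print(fuse_1d(0.0, 1.0, [0.0] * 10, 0.001, visual_position=0.02, visual_weight=0.5))
```

    Running both sensors and the fusion continuously is also part of the reason for the battery drain mentioned above.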

    ARKit has a handful of useful approaches for robust tracking and accurate measurements. Among its arsenal you can find easy-to-use functionality for saving and retrieving ARWorldMaps. A world map is an indispensable "portal" for Persistent and Multiuser AR experiences: it allows you to come back to the same environment, filled with the same chosen 3D content, as it was just before your app became inactive. Support for simultaneous front and back camera capture and for collaborative sessions is also great.

    There is good news for gamers: up to 6 people can simultaneously play the same AR game, thanks to the MultipeerConnectivity framework. For 3D geometry you can use the USDZ file format, developed and supported by Pixar. USDZ is a good choice for sophisticated 3D models with multiple PBR shaders, physics, animations and spatial sound. Also you can use the following 3D formats for ARKit.

    ARKit can also help you with People and Objects Occlusion (based on segmentation of the alpha and depth channels), LiDAR Scene Reconstruction, Body Motion Capture tracking, Vertical and Horizontal Plane detection, Image detection, 3D Object detection, 3D Object scanning, 4K HDR video capture and RoomPlan scanning powered by ARKit. With People and Objects Occlusion, your AR content realistically passes behind and in front of real-world entities, making AR experiences even more immersive. Realistic reflections that use machine learning algorithms, and Face tracking that lets you track up to 3 faces at a time, are also available.



    Using ARKit and iBeacons, you can help an iBeacon-aware application know what room it's in and show the right 3D content chosen for that room. When working with ARKit you should make intensive use of the ARAnchor class and all its subclasses.


    To create ARKit apps you need macOS Sonoma, Xcode 15 and a device running iOS 17+ or visionOS 1.0+. ARKit pairs well with the Metal framework for GPU acceleration. Don't forget that ARKit tightly integrates with Unity. At the moment the most powerful and energy-efficient chipsets for AR experiences are Apple M2 (5 nm) and A17 Bionic (3 nm).

    ARKit price: FREE.

    ARKit pros                                | ARKit cons
    LiDAR and Depth API support               | OS and Chipsets' Restrictions
    visionOS Simulator in Xcode               | No auto-update for ARAnchors
    WorldMaps, AirTags and iBeacon awareness  | No iOS Simulator in Xcode
    Vision Pro headset support                | No external camera support
    Geospatial anchoring                      | Quickly drains phone's or headset's battery





    Apple visionOS + RealityKit

    RealityKit and visionOS are a door to the world of spatial computing, the next level after mobile devices. RealityKit was introduced at WWDC 2019. It is a high-level framework for developing visionOS (introduced in 2023), iOS and macOS apps. It supports the contemporary Entity-Component-System paradigm, which lets you implement non-AR and AR experiences more efficiently. RealityKit's AR capabilities (such as the tracking system and scene understanding) are entirely based on ARKit. In addition, in visionOS RealityKit acts as the system renderer. Frankly speaking, there is no need to list all the features of RealityKit here, as you can read about them in this SO post.

    Measurement quality is high when high-quality hardware and software are matched to each other. That's why using the Vision Pro headset running visionOS and RealityKit-based software will give you the expected quality of measurements in large rooms. The Vision Pro headset has 12 cameras and 5 sensors to perfectly track the environment and interact with 3D models using hand gestures, controllers and voice commands. The combination of RealityKit with the NearbyInteraction API and the recently updated CoreLocation framework gives you a valuable toolkit for accurate measurements.

    In visionOS, there are two different approaches when you use gestures. People can interact with app's content primarily using their eyes and hands. When applying an indirect gesture, a person looks at a model, and then selects it by tapping an index finger to their thumb, for example (it's one of a dozen possible hand gestures). In case of a direct gesture, the user’s finger directly interacts with the model in 3D space. By the way, visionOS recognizes and tracks 26 skeletal joints on each hand in real time.
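
    The "tap an index finger to the thumb" selection can be thought of as a distance test between two tracked joints. The Python sketch below shows only that idea: `is_pinch` and the 1.5 cm threshold are hypothetical, joint positions are assumed to be (x, y, z) tuples in metres, and visionOS itself recognises the gesture for you rather than exposing this exact check.

```python
import math


def is_pinch(index_tip, thumb_tip, threshold_m=0.015):
    """Return True when the two fingertip positions are closer than the threshold."""
    return math.dist(index_tip, thumb_tip) < threshold_m


if __name__ == "__main__":
    print(is_pinch((0.10, 1.20, -0.30), (0.105, 1.20, -0.30)))  # fingertips 5 mm apart
    print(is_pinch((0.10, 1.20, -0.30), (0.16, 1.20, -0.30)))   # fingertips 6 cm apart
```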



    Pay attention to RealityKit's satellite, the Reality Composer Pro app, which is part of Xcode 15+. Its intuitive UI is good for quick prototyping of AR/VR scenes. Scenes built in Reality Composer Pro can be packed with anchors and models with custom components and ILM MaterialX shaders. Reality Composer Pro has a royalty-free library of downloadable 3D assets (USDAs, sounds and ShaderGraph compositions). You can export your scene as a .usdz file or a lightweight .reality file for an AR Quick Look experience.

    RealityKit price: FREE.

    RealityKit pros                          | RealityKit cons
    visionOS Simulator in Xcode              | Intensive usage of CPU/GPU
    Vision Pro headset support               | iOS 13+, macOS 10.15+, visionOS 1.0+ only
    Multithreaded rendering                  | No iOS Simulator in Xcode
    Support for Reality Composer Pro scenes  | Quickly drains phone's or headset's battery
    Auto-updating tracking target            | Some RK components work only in visionOS





    PTC Vuforia

    In October 2015 PTC acquired Vuforia from Qualcomm for $65 million. Bear in mind that Qualcomm launched Vuforia in 2010, so Vuforia is the older sister in the AR family. Big sister is watching you, guys! ;)

    In November 2016 Unity Technologies and PTC announced a strategic collaboration to simplify AR development. Since then they have been working together to integrate new features of the Vuforia AR platform into the Unity game engine. Vuforia can be used with development environments such as Unity, MS Visual Studio, Apple Xcode and Android Studio. It supports a wide range of smartphones, tablets and AR smart glasses, such as HoloLens 2 and Magic Leap 2.

    Vuforia Engine's Visual-Inertial Simultaneous Localization And Mapping, or VISLAM, is an algorithm that implements a markerless AR experience. VISLAM combines the benefits of Visual-Inertial Odometry (VIO) and Simultaneous Localization And Mapping (SLAM).

    Vuforia Engine boasts roughly the same principal capabilities that you can find in the latest version of ARKit, but it also has its own tools, such as Model Targets with Deep Learning, External Camera support for iOS, new experimental APIs for ARCore, and support for the industry's latest AR glasses. The main advantage of Vuforia over ARKit and ARCore is that it has a wider list of supported devices and supports the development of Universal Windows Platform apps for Intel-based Windows devices, including Microsoft Surface and HoloLens. Vuforia has a standalone version and a version baked directly into Unity.


    Vuforia Fusion

    Vuforia Fusion is a set of technologies designed to provide the best possible AR experience on a wide range of devices. It was designed to solve the problem of fragmentation in AR enabling technologies such as cameras, sensors, chipsets, and software frameworks. With Vuforia Fusion, your app will automatically provide the best experience possible with no extra work required on your end.

    Vuforia Fusion has the following functionality:

    • Advanced Model Targets 360 | recognition powered by AI.
    • Larger Area Targets without a prior | to spaces of approximately 4,000 m².
    • Model Targets with ML | allow you to instantly recognize objects by shape.
    • Barcode Scanner | an API for reading QR codes and barcodes.
    • Model Target Runtime 3D Guide Views | to create guide views in Unity at runtime.
    • Model Target Web API | generates Model Targets using the Web API.
    • Image Targets | the easiest way to put AR content on flat objects.
    • Multi Targets | for objects with flat surfaces and multiple sides.
    • Cylinder Targets | for placing AR content on objects with cylindrical shapes.
    • Ground Plane | enables content to be placed on floors and tabletop surfaces.
    • VuMarks | allow you to identify and add content to series of objects.
    • Object Targets | for scanning an object.
    • Static and Adaptive Modes | for stationary and moving objects.
    • Simulation Play Mode | allows you to “walk through” or around the 3D model.
    • AR Session Recorder | can record AR experiences in the location.
    • and, of course, Vuforia Engine Area Targets.

    Vuforia Engine Area Targets enable developers to use an entire space, be it a factory floor or retail store, as an AR target. Using a supported device, like the Matterport Pro2 or Leica BLK360 3D cameras, developers can create a detailed 3D scan of a desired location. Once the scan produces a 3D model, it can be converted into an Area Target with the Vuforia Area Target Generator. This target can then be brought into Unity.

    Occlusion Management is one of the key features for building a realistic AR experience. When you're using Occlusion Management, Vuforia Engine detects and tracks targets, even when they’re partially hidden behind everyday barriers, like your hand. Special occlusion handling allows apps to display graphics as if they appear inside physical objects.

    Vuforia API allows for a Static or Adaptive mode. When the real-world model remains stationary, like a large industrial machine, implementing the Static API will use significantly less processing power. This enables a longer lasting and higher performance experience for those models. For objects that won’t be stationary the Adaptive API allows for a continued robust experience.

    Vuforia supports Metal acceleration for iOS devices. You can also use Vuforia Samples in your projects. For example, the Vuforia Core Samples library includes various scenes using Vuforia features, including a pre-configured Object Recognition scene that you can use as a reference and starting point for an Object Recognition application.

    Here are Pros and Cons.

    Vuforia pros                 | Vuforia cons
    Supports Android, iOS, UWP   | The price is not reasonable
    A lot of supported devices   | Poor developer documentation
    External Camera support      | PTC's mixing business with politics
    Webcam/Simulator Play Mode   | Doesn't support Geo tracking
    Cylinder Targets support     | Poor potential in Unity





    CONCLUSION:

    There are no vital limitations when developing with PTC Vuforia compared to ARCore and ARKit. Vuforia is a great, long-established product: it supports a wider list of Apple and Android devices (even those that are not officially supported) as well as several of the latest models of smart glasses and AR headsets.

    But in my opinion, ARKit, RoomPlan and the Reality Family toolkit (RealityKit, AR Quick Look, Reality Composer Pro, iOS Reality Composer, and Reality Converter) have an extra bunch of useful up-to-date features that Vuforia and ARCore only partially have. In my experience, ARKit and RealityKit have better short-distance measurement accuracy within a room than any ARCore-compatible device, without any need for calibration. This is achieved thanks to the Vision Pro hardware, or the LiDAR dToF scanner in some iOS devices. ARCore, in its turn, uses iToF cameras with the Raw Depth API or the Depth from Motion algorithm. Both iToF and LiDAR allow you to create a high-quality virtual mesh with OcclusionMaterial for real-world surfaces at the scene understanding stage. This mesh is ready for measurement and ready for collision. With iToF and dToF sensors, frameworks instantly detect non-planar surfaces and surfaces with no trackable features at all, such as texture-free white walls in poorly lit rooms.
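
    Whichever SDK supplies the 3D points, the floor-plan arithmetic itself is framework-independent. The Python sketch below assumes you already have room corners as world-space coordinates in metres (for example, taken from anchors or hit-test results); the function names are hypothetical and the corner values are made-up sample data.

```python
import math


def distance_m(a, b):
    """Straight-line distance between two 3D points given in metres."""
    return math.dist(a, b)


def floor_area_m2(corners):
    """Area of a floor polygon from its (x, z) corners listed in order (shoelace formula)."""
    corners = list(corners)
    if len(corners) < 3:
        raise ValueError("a floor polygon needs at least three corners")
    twice_area = 0.0
    for (x1, z1), (x2, z2) in zip(corners, corners[1:] + corners[:1]):
        twice_area += x1 * z2 - x2 * z1
    return abs(twice_area) / 2.0


if __name__ == "__main__":
    # Sample 4 m x 3 m rectangular room; the y coordinate (height) is dropped for the area.
    room = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
    print(distance_m((0.0, 0.0, 0.0), (4.0, 0.0, 3.0)))  # diagonal of the floor
    print(floor_area_m2(room))
```

    The accuracy of the result depends entirely on how accurately the corner points were tracked, which is where the depth hardware discussed above matters.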

    If you implement iBeacon tools, ARWorldMaps and support for GPS, it will help you boost tracking quality and eliminate many tracking errors accumulated over time. ARKit's tight integration with the Vision and CoreML frameworks also makes a huge contribution to the robustness of the AR toolset. Integration with Apple Maps allows ARKit to put GPS Location Anchors outdoors with the highest precision possible at the moment. ARCore also uses Geospatial anchors that obtain geo-data from Google Earth and Street View images, created with the help of Google Trekker.

    Vuforia's measurement accuracy is highly dependent on what platform you're developing for. Some of Vuforia's features are built on top of the platform's tracking engine (ARKit or ARCore). Even the popular Vuforia Chalk application uses the ARKit positional tracker. However, let's see what hardware and technologies Apple, Google, PTC and Microsoft will offer us in the near future, when each of these companies has its own developed ecosystem in the field of spatial computing.