
Firebase ML Kit, Google Cloud Vision API, or OpenCV


I want to build an Android app for gaze tracking, and I would like to ask which of the following tools would give better results:

  • Google Cloud Vision API
  • OpenCV (e.g. a Haar cascade classifier)
  • Firebase ML Kit with facial landmarks

Solution

I don't know if you plan to create a commercial application or if it's for research purposes; the things to consider change a bit between these two scenarios.

For object tracking I'd probably go with Google's ML Kit: it has ready-to-use models that also work offline, and it takes care of all the hard work of raw TensorFlow (even on iOS) if you want to run your own custom models. So the hard part becomes creating an efficient model, not running it (see the sketch below).
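
As a rough illustration, here is a minimal sketch of detecting eye landmarks with ML Kit's face detection API (the standalone `com.google.mlkit:face-detection` artifact; the older Firebase-branded API was similar). The `bitmap` variable stands in for a camera frame you already have:

```kotlin
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetectorOptions
import com.google.mlkit.vision.face.FaceLandmark

// Enable landmarks (eye positions) and classification (eye-open probabilities)
val options = FaceDetectorOptions.Builder()
    .setLandmarkMode(FaceDetectorOptions.LANDMARK_MODE_ALL)
    .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
    .build()

val detector = FaceDetection.getClient(options)
val image = InputImage.fromBitmap(bitmap, 0) // bitmap: assumed camera frame, 0° rotation

detector.process(image)
    .addOnSuccessListener { faces ->
        for (face in faces) {
            // Eye landmark positions, the raw input for a gaze estimator
            val leftEye = face.getLandmark(FaceLandmark.LEFT_EYE)?.position
            val rightEye = face.getLandmark(FaceLandmark.RIGHT_EYE)?.position
            // Head pose angles, useful as a coarse gaze proxy
            val yaw = face.headEulerAngleY
            val leftOpen = face.leftEyeOpenProbability
        }
    }
    .addOnFailureListener { e -> e.printStackTrace() }
```

Note that ML Kit gives you eye positions and head pose, not gaze direction itself; you would still build the gaze estimation on top of these landmarks.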

I haven't used the Google Cloud Vision API yet, just GCP machines to train a neural network, and they came in handy for that.

OpenCV is a good option, but it can be hard to implement and maintain afterwards, and it will considerably increase your app size. I used a Haar cascade in my final paper two years ago; the work was hard and the result not that accurate. Today I'd check OpenCV's DNN module and go with YOLO, like here. To summarize, I'd only recommend it if you have some specific image-processing demand, but first check Android's ColorFilter or ImageFilterView. If you do choose OpenCV, I'd recommend compiling it yourself with CMake, as described here, with only the modules you need, so your app size won't increase that much. A sketch of the Haar cascade approach follows.
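
For reference, a minimal sketch of eye detection with a Haar cascade using OpenCV's Android (Java/Kotlin) bindings. Here `cascadeFile` is a hypothetical path to `haarcascade_eye.xml` unpacked from the app's assets, and `frame` an assumed RGBA camera frame as a `Mat`:

```kotlin
import org.opencv.core.Mat
import org.opencv.core.MatOfRect
import org.opencv.imgproc.Imgproc
import org.opencv.objdetect.CascadeClassifier

// cascadeFile: hypothetical absolute path to haarcascade_eye.xml copied from assets
val eyeCascade = CascadeClassifier(cascadeFile)

// Convert the RGBA camera frame to an equalized grayscale image
val gray = Mat()
Imgproc.cvtColor(frame, gray, Imgproc.COLOR_RGBA2GRAY)
Imgproc.equalizeHist(gray, gray)

// Run the cascade; 1.1 (scale factor) and 3 (min neighbors) are typical starting values
val eyes = MatOfRect()
eyeCascade.detectMultiScale(gray, eyes, 1.1, 3)
for (rect in eyes.toArray()) {
    // rect is the bounding box of a detected eye region
}
```

This also shows why the accuracy is limited: a cascade only gives you coarse bounding boxes, so everything from pupil localization onwards is still up to you.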

There are also other options like Dlib or PyTorch. I worked with dlib's SVM and a custom model last year; its results were good, but it was slow to run, about 3–4 seconds, compared to a TensorFlow NN that runs in 50–60 milliseconds (even faster with quantized models, as sketched below). I don't have enough experience with PyTorch or other frameworks to share anything about them.
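
To illustrate the on-device TensorFlow path, here is a minimal sketch of running a (possibly quantized) TensorFlow Lite model on Android. The model name `gaze_model.tflite` and the input/output shapes are hypothetical placeholders for whatever model you train:

```kotlin
import android.content.Context
import org.tensorflow.lite.Interpreter
import java.io.FileInputStream
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel

// Memory-map a model bundled (uncompressed) in the app's assets folder
fun loadModel(context: Context, assetName: String): MappedByteBuffer {
    val fd = context.assets.openFd(assetName)
    FileInputStream(fd.fileDescriptor).channel.use { channel ->
        return channel.map(FileChannel.MapMode.READ_ONLY, fd.startOffset, fd.declaredLength)
    }
}

// "gaze_model.tflite" and the 1x128 / 1x2 shapes are hypothetical placeholders
val interpreter = Interpreter(loadModel(context, "gaze_model.tflite"))
val input = Array(1) { FloatArray(128) }  // e.g. flattened eye-region features
val output = Array(1) { FloatArray(2) }   // e.g. predicted (x, y) gaze point
interpreter.run(input, output)
```

With a quantized model the same `Interpreter` call applies; the smaller weights are what bring the inference time down to the tens-of-milliseconds range mentioned above.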