Tags: tensorflow, deep-learning, pytorch, neural-network

Which deep learning libraries support compressing deep learning models for use on phones?


I want to build an advanced deep learning model (for example, a model that uses attention) and run it on Android phones. There will be no training on the device, of course; I will only use it for inference.
I am looking for a library that can do this and can compress the model size so it fits on a phone/Android device.
Do you know of any projects or apps similar to my goal?


Solution

  • There is a Caffe fork called Ristretto. It allows compressing neural nets to lower numerical precision (fewer than 32 bits per parameter) while keeping high accuracy. MXNet and TensorFlow also have this feature now; PyTorch doesn't have it yet. These tools reduce the memory required to store the neural net's parameters, but they are not specific to Android. A sketch of the TensorFlow route is shown below.
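
As a minimal sketch of the TensorFlow side of this (not part of the original answer), post-training quantization with the TensorFlow Lite converter stores weights at reduced precision and produces a file that can be bundled with an Android app. The model name and file paths below are hypothetical placeholders:

```python
import tensorflow as tf

# Load an already-trained Keras model (hypothetical path).
model = tf.keras.models.load_model("my_attention_model.h5")

# Convert to TensorFlow Lite with default post-training quantization,
# which stores parameters at reduced precision to shrink the file size.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Write the quantized model; the .tflite file can be shipped with an
# Android app and executed with the TensorFlow Lite Interpreter.
with open("my_attention_model.tflite", "wb") as f:
    f.write(tflite_model)
```

On the phone, the resulting .tflite file is loaded with the TensorFlow Lite Interpreter for inference only, which matches the "no training on the device" requirement in the question.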