I am putting a model into production and I am required to scan all dependencies (Pytorch and Numpy) beforehand via VeraCode Scan.
I noticed that the majority of the flaws are coming from test scripts and caffe2 modules in Pytorch and numpy.
Is there any way to build/install only part of these packages that I use in my application? (e.g. I won't use testing and caffe2 in the application so there's no need to have them in my PyTorch / Numpy source code)
You could package your application using pyinstaller
. This tool packages your app with Python and dependencies and use only the parts you need (simplifying, in reality it's hard to trace your package exactly so some other stuff would be bundled as well).
Also you might be in for some quirks and workarounds to make it work with pytorch
and numpy
as those dependencies are quite heavy (especially pytorch
).
numpy
and pytorch
are pretty similar feature-wise (as PyTorch tries to be compatible with it) hence maybe you could only use only of them which would simplify the whole thing further
Depending on other parts of your app you may write it (at least neural network) in C++ using PyTorch's C++ frontend which is stable since 1.5.0
release.
Going this route would allow you to compile PyTorch's .cpp
source code statically (so all dependencies are linked) which allows you for relatively small binary size (30Mb
when compared to PyTorch's 1GB+
), but requires a lot of work.