python-3.x anaconda pyinstaller python-venv nuitka

PyInstaller and Nuitka generate ridiculously large files. How can the size be reduced?


I am using the Anaconda distribution of Python 3.6 on Windows and I wish to convert a simple Python script to a standalone executable file. The problem is that the generated file is extremely large (~900 MB) even though the script uses only a few external libraries.

More specifically, I use PyQt5 and pyqtgraph and some standard-library modules such as sys, time, os and math. So far I have tried both PyInstaller and Nuitka, but with neither of them have I been able to drastically reduce the executable file's size.

I noticed some mkl files that take up roughly 600 MB of space. After removing those files, I was still able to run my program, seemingly without a hitch. I also noticed two files named libopenblas, which seem to be vital for the operation of the program and which amount to 100 MB in total.
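To see which bundled files dominate the output folder (the mkl and libopenblas files mentioned above, for instance), a small listing script can help. This is only a sketch: the function name `largest_files` and the folder `dist/myapp` are hypothetical, chosen for illustration.

```python
from pathlib import Path


def largest_files(root, top=10):
    """Return the `top` largest files under `root` as (size_in_mb, relative_path) pairs."""
    root = Path(root)
    sizes = [
        (p.stat().st_size / 1e6, str(p.relative_to(root)))
        for p in root.rglob("*")
        if p.is_file()
    ]
    # Tuples sort by size first, so reversing puts the biggest files on top.
    return sorted(sizes, reverse=True)[:top]
```

Calling `largest_files("dist/myapp")` on a build folder and printing the pairs shows at a glance whether a handful of DLLs account for most of the size.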

I have looked into this matter and I found similar questions on Stack Overflow and other sites. People claim that they were able to generate exe files using PyInstaller which were less than 40 MB. It is said that in order to achieve this size reduction, one should exclude all clutter libraries. However, I do not understand what qualifies as "clutter". For example, I tried excluding numpy and the program did not run, even though I wasn't using it directly in my program. Apparently, the libraries I do use have dependencies of their own without which the program can't run.
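For reference, excluding modules in PyInstaller is done with its `--exclude-module` option, one flag per module. The sketch below only builds the argument list; the script name and the excluded modules are hypothetical placeholders, and any module that one of the remaining libraries imports (as numpy turned out to be here) cannot be excluded without breaking the program.

```python
def build_pyinstaller_args(script, excludes):
    """Build a PyInstaller argument list that excludes the given modules."""
    args = [script]
    for module in excludes:
        # One --exclude-module flag per module name.
        args += ["--exclude-module", module]
    return args


# Hypothetical script name and exclusion list; adjust both to the project.
args = build_pyinstaller_args("myscript.py", ["tkinter", "matplotlib"])

# Equivalent command line for a terminal:
print("pyinstaller " + " ".join(args))

# With PyInstaller installed, the same list can also be run from Python:
#   import PyInstaller.__main__
#   PyInstaller.__main__.run(args)
```

Building the list in one place makes it easy to add or drop a single exclusion and rebuild until the smallest working set is found.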

Finally, I found this forum where it is suggested that a virtual environment be used instead of Anaconda. I tried setting one up using venv, but I have trouble getting it to work, as my Anaconda installation interferes with it and does not allow me to install all the necessary libraries afresh.

No matter what I do, I always end up with at least 200 MB worth of data. How can I get a functional executable that is less than 40 MB in size? If a simple program like this produces such a large file, imagine what I will end up with if I decide to integrate other libraries such as tensorflow or scipy. It is not a viable solution and so far I haven't figured out a way around it. Any help is greatly appreciated.

EDIT: I tried installing Python from its official website and I removed Anaconda from PATH. I ended up with a slightly smaller file, but it did not run.


Solution

  • I realize this is old but I came across it today and have worked on the same problem.

    A PyInstaller hello world program will compile to about 12 MB. However, as soon as you start doing anything substantial, numpy usually gets pulled in, and it is large.

    I have one program that requires

    paramiko
    scp
    

    It compiles to 12 MB.

    I have another that requires

    numpy
    tqdm
    nomkl
    matplotlib
    

    It compiles to ~200 MB (zips to ~75 MB). Installing the nomkl package makes conda use a numpy build without MKL, so PyInstaller automatically keeps the mkl libraries out. You have to remove numpy, then install nomkl, and then reinstall numpy. Without nomkl the above program was about 1 GB.

    I recommend using conda environments rather than virtual environments if you have Anaconda installed.
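
As a rough sketch of the two recommendations above combined, the snippet below prints conda commands that create a dedicated environment and install nomkl before numpy. The environment name `build-env`, the helper `conda_nomkl_commands`, and the extra package list are hypothetical. It also assumes that installing nomkl first in a fresh environment has the same effect as the remove-and-reinstall sequence described for an existing one.

```python
def conda_nomkl_commands(env_name, python_version="3.6", extra_packages=()):
    """Return conda commands (as argument lists) for a build environment without MKL."""
    return [
        ["conda", "create", "--yes", "--name", env_name, f"python={python_version}"],
        # nomkl goes in before numpy so that a non-MKL numpy build is selected.
        ["conda", "install", "--yes", "--name", env_name, "nomkl"],
        ["conda", "install", "--yes", "--name", env_name, "numpy", *extra_packages],
    ]


# Hypothetical environment name and extra packages; adjust to the project.
for cmd in conda_nomkl_commands("build-env", extra_packages=["matplotlib", "tqdm"]):
    print(" ".join(cmd))
```

The printed lines can be pasted into an Anaconda prompt one at a time. Building the executable from inside that environment keeps packages from the base Anaconda installation out of the bundle.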