Tags: python, pyinstaller, py2exe, auto-py-to-exe, feather

Reading/writing Apache Feather files from a compiled Python script fails


I'm trying to create a script that converts a CSV file to a Feather file, and then compile that script to a stand-alone exe that other applications can invoke with two parameters: the path to a CSV file and the path to the output Feather file. The environment is Windows 10 with Python 3.11. I've written a script that does what's required and runs successfully when invoked from the command line like this:

python arrowreadfeather.py C:\temp\Feather_test\data.csv C:\temp\Feather_test\data.feather
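The full conversion script is not shown in this post. As a rough illustration of the kind of script described, here is a minimal sketch. It assumes the CSV is read with `pyarrow.csv.read_csv` and written with `pyarrow.feather.write_feather`. The helper names `parse_args` and `csv_to_feather` are made up for this example and are not taken from the original script.

```python
import sys


def parse_args(argv):
    """Return (csv_path, feather_path), or None when an argument is missing."""
    if len(argv) < 3:
        return None
    return argv[1], argv[2]


def csv_to_feather(csv_path, feather_path):
    # pyarrow is imported here only so that the argument handling above can be
    # exercised on a machine without pyarrow installed (an assumption of this sketch).
    import pyarrow.csv as pv
    import pyarrow.feather as ft

    table = pv.read_csv(csv_path)          # parse the CSV into an Arrow Table
    ft.write_feather(table, feather_path)  # write that Table as a Feather file


def main(argv):
    paths = parse_args(argv)
    if paths is None:
        print("Insufficient args for conversion. Specify a source and dest.")
        return
    csv_to_feather(*paths)


if __name__ == "__main__":
    main(sys.argv)
```

Invoked as in the command above, `argv[1]` is the CSV path and `argv[2]` is the Feather path.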

I've compiled the script to an exe using three freely available tools: py2exe, pyinstaller and auto-py-to-exe.

However, when I compile this script to an exe (either stand-alone or in a single folder) and then run it from the command line, I get this error:

C:\temp\csvtofeather\dist>arrowreadfeather.exe C:\temp\Feather_test\data.feather
Calling from_feather with: C:\temp\Feather_test\data.feather
Traceback (most recent call last):
  File "arrowreadfeather.py", line 28, in <module>
  File "arrowreadfeather.py", line 22, in main
  File "arrowreadfeather.py", line 7, in from_feather
  File "pyarrow\feather.pyc", line 226, in read_feather
  File "pyarrow\array.pxi", line 830, in pyarrow.lib._PandasConvertible.to_pandas
  File "pyarrow\table.pxi", line 3990, in pyarrow.lib.Table._to_pandas
  File "pyarrow\pandas_compat.pyc", line 810, in table_to_blockmanager
  File "pyarrow\pandas_compat.pyc", line 968, in _reconstruct_index
  File "pyarrow\pandas-shim.pxi", line 126, in pyarrow.lib._PandasAPIShim.pd.get
  File "pyarrow\pandas-shim.pxi", line 100, in pyarrow.lib._PandasAPIShim._check_import
  File "pyarrow\pandas-shim.pxi", line 56, in pyarrow.lib._PandasAPIShim._import_pandas
ModuleNotFoundError: No module named 'pyarrow.vendored.version'

I simplified the script just to read an existing feather file and that gives the same error. Here is the simplified script and the error output:

```python
import sys
import pyarrow.feather as ft

def from_feather(_file: str):
    table = ft.read_feather(_file)
    return table

def main():
    if len(sys.argv) < 2:
        print("Insufficient args for conversion. Specify a source and dest.")
        return None

    print("Calling from_feather with: ", sys.argv[1])
    feather_file = sys.argv[1]
    d_table = from_feather(feather_file)

if __name__ == "__main__":
    main()
```

C:\Python311\output>arrowreadfeather.exe C:\temp\Feather_test\data.feather
Calling from_feather with: C:\temp\Feather_test\data.feather
Traceback (most recent call last):
  File "arrowreadfeather.py", line 23, in <module>
  File "arrowreadfeather.py", line 19, in main
  File "arrowreadfeather.py", line 5, in from_feather
  File "pyarrow\feather.py", line 226, in read_feather
  File "pyarrow\array.pxi", line 830, in pyarrow.lib._PandasConvertible.to_pandas
  File "pyarrow\table.pxi", line 3990, in pyarrow.lib.Table._to_pandas
  File "pyarrow\pandas_compat.py", line 810, in table_to_blockmanager
  File "pyarrow\pandas_compat.py", line 968, in _reconstruct_index
  File "pyarrow\pandas-shim.pxi", line 126, in pyarrow.lib._PandasAPIShim.pd.get
  File "pyarrow\pandas-shim.pxi", line 100, in pyarrow.lib._PandasAPIShim._check_import
  File "pyarrow\pandas-shim.pxi", line 56, in pyarrow.lib._PandasAPIShim._import_pandas
ModuleNotFoundError: No module named 'pyarrow.vendored.version'
[20312] Failed to execute script 'arrowreadfeather' due to unhandled exception!
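Since the same script works under the plain interpreter, a quick check of whether the module named in the error can be located there may help tell a broken pyarrow installation apart from a bundling problem. The sketch below is only an illustration. The helper name `module_available` is made up for this example, and it relies solely on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def module_available(name):
    """Return True if the import system can locate the module `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ImportError: a parent package (e.g. 'pyarrow') cannot be imported.
        # ValueError: the module is already loaded but has no usable spec.
        return False


if __name__ == "__main__":
    for name in ("pyarrow", "pyarrow.vendored", "pyarrow.vendored.version"):
        print(name, "->", "found" if module_available(name) else "missing")
```

If all three names are reported as found when this is run with `python`, the module exists in the source environment and is presumably just not being included in the exe.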

Thanks in advance for any help/pointers.


Solution

  • The information above did solve my problem. These are the steps I took:

    1. Create a .spec file from the script you want to convert to an exe using this command:

      pyi-makespec --onefile yourscriptfile.py

    This creates a .spec file that pyinstaller uses to create the stand-alone exe.

    2. Edit the .spec file in a text editor. In my case I added the following lines at the top of the file, which were required to solve my issue:

      from PyInstaller.utils.hooks import collect_submodules
      allhiddenimports = collect_submodules('pyarrow.vendored')

    3. Edit the field 'hiddenimports' to read as follows:

      hiddenimports=allhiddenimports,

    4. Build the script using the .spec file with the following command:

      pyinstaller yourspecfile.spec

    This did indeed produce an executable that runs without dependency issues.
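    As an alternative to editing the .spec file by hand, the same intent can probably be expressed through PyInstaller's `--collect-submodules` option, driven from a small Python build script. This is only a sketch and not the method used above. The function names `pyinstaller_args` and `build` are made up for this example, and it assumes a PyInstaller version recent enough to support `--collect-submodules`.

```python
def pyinstaller_args(script, packages=("pyarrow.vendored",)):
    """Build the PyInstaller argument list for a one-file build."""
    args = [script, "--onefile"]
    for package in packages:
        # Same intent as collect_submodules(package) in the .spec file:
        # bundle every submodule of the package, even if no import is detected.
        args += ["--collect-submodules", package]
    return args


def build(script):
    # Imported lazily so the argument list can be inspected without PyInstaller.
    import PyInstaller.__main__

    PyInstaller.__main__.run(pyinstaller_args(script))
```

    Calling `build("yourscriptfile.py")` should then be roughly equivalent to running `pyinstaller yourscriptfile.py --onefile --collect-submodules pyarrow.vendored` from the command line.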

    I don't work with Python that often and needed both of those pieces of information above to get to a working solution. I would like to give joint kudos but...

    Thanks.