Package a Python script with pandas using PEX


I have a simple Python script that depends on pandas. I need to package it with pex so that it can be run without installing its dependencies first.

import sys
import csv 
import argparse
import pandas as pd 

class myLogic():
    def __init__(self):
        pass         

    def loadData(self, data_file):
        return pd.read_csv(data_file, delimiter="|")

    #command line interaction interface 
    def processInputArguments(self,args):

        parser = argparse.ArgumentParser(description="my logic")

        #transactions file name 
        parser.add_argument('-td',
                            '--data',
                            type=str,
                            dest='data',
                            help='data file location'
                            )       


        options = parser.parse_args(args)
        return vars(options)


    def main(self):
        options = self.processInputArguments(sys.argv[1:])

        data_file = options["data"]

        data = self.loadData(data_file)
        print(data.head())  # call form runs on both Python 2 and 3


if __name__ == '__main__':
    ml = myLogic()
    ml.main()

To build the pex file, I ran the following:

pex pandas -e myprogram.myLogic:main -o test1.pex 

But I am getting this error when running the generated pex file:

Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 317, in execute
  File ".bootstrap/_pex/pex.py", line 250, in _wrap_coverage
  File ".bootstrap/_pex/pex.py", line 282, in _wrap_profiling
  File ".bootstrap/_pex/pex.py", line 360, in _execute
  File ".bootstrap/_pex/pex.py", line 418, in execute_entry
  File ".bootstrap/_pex/pex.py", line 435, in execute_pkg_resources
  File ".bootstrap/pkg_resources.py", line 2088, in load
ImportError: No module named myLogic

I also tried packaging with the -c switch (script entry point), using the following command:

pex pandas -c myprogram.py -o test2.pex

But that also fails, this time while building the pex file:

Traceback (most recent call last):
  File "/usr/local/bin/pex", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/pex/bin/pex.py", line 509, in main
    pex_builder = build_pex(reqs, options, resolver_options_builder)
  File "/usr/local/lib/python2.7/dist-packages/pex/bin/pex.py", line 486, in build_pex
    pex_builder.set_script(options.script)
  File "/usr/local/lib/python2.7/dist-packages/pex/pex_builder.py", line 214, in set_script
    script, ', '.join(self._distributions)))
TypeError: sequence item 0: expected string, DistInfoDistribution found

Solution

  • The only option that has worked for me so far is to build a pex interpreter that includes pandas and ship it alongside the Python script. The interpreter can be built as follows:

    pex pandas -o my_interpreter.pex
    

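    The script can then be run by passing it to my_interpreter.pex as its first argument, as with a regular Python interpreter. Note, however, that this fails when the Python used to build the pex file is a UCS4 build and the Python used to run it is a UCS2 build, since the compiled extensions bundled with pandas are specific to the Unicode build.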
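
    To see which kind of build a given interpreter is, the check below can be run on both the build machine and the target machine. It is only a minimal sketch: the helper name unicode_build is made up for illustration, and it relies on sys.maxunicode, which distinguishes narrow from wide builds on Python 2 but always reports the wide value on Python 3.3 and later.

```python
import sys


def unicode_build():
    """Return 'UCS4' for a wide Unicode build, 'UCS2' for a narrow one.

    Hypothetical helper for illustration; not part of pex or pandas.
    """
    # On Python 2 (and Python 3 before 3.3), sys.maxunicode is 0xFFFF on
    # narrow (UCS2) builds and 0x10FFFF on wide (UCS4) builds.
    # From Python 3.3 onward it is always 0x10FFFF.
    return "UCS4" if sys.maxunicode > 0xFFFF else "UCS2"


if __name__ == "__main__":
    print("%s build (sys.maxunicode = %#x)" % (unicode_build(), sys.maxunicode))
```

    If the two machines report different builds, a pex file built on one is unlikely to run pandas on the other.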