Tags: python, numpy, pandas, importerror

pandas ImportError C extension when io.py in same directory


I'm not sure whether this is a pandas issue or a gap in my understanding of absolute/relative imports.

$ python -c "import pandas; print pandas.__version__"
0.17.1
$ python -V
Python 2.7.12 :: Anaconda 2.4.1 (x86_64)

# this runs fine (ie it doesn't raise exception)
$ mkdir x; echo "import pandas" > x/main.py; python x/main.py

# make io.py in same directory
$ touch x/io.py

# now it fails
$ python x/main.py
Traceback (most recent call last):
  File "x/main.py", line 1, in <module>
    import pandas
  File "/Users/GS/anaconda/lib/python2.7/site-packages/pandas/__init__.py", line 13, in <module>
    "extensions first.".format(module))
ImportError: C extension: StringIO not built. If you want to import pandas from
the source directory, you may need to run 'python setup.py build_ext --inplace'
to build the C extensions first.

I've also tried the same with a fresh install of Python (not from Anaconda), and the result was the same.

The same happens with other file names from the standard library, e.g. x/string.py or x/re.py.

Renaming the file to, for example, x/my_io.py as suggested in the question below is a workaround, but I'd like to understand why this is happening.

Why is this happening? What is the mechanism?


This seems to happen with numpy as well when there is a datetime.py file.

Here's a snippet that tests several combinations at once, using GNU parallel and a non-Anaconda Python:

$ parallel -k -j1  "\
echo \"=== {1} : {2} ===\";\
rm -rf x; \
mkdir -p x/y/z;\
echo \"import {1}; print \\\"ok\\\";\" > x/y/z/main.py;\
python x/y/z/main.py;\
touch x/y/z/{2}.py;\
python x/y/z/main.py;\
" ::: numpy pandas ::: os sys re io logging datetime
=== numpy : os ===
ok
ok
=== numpy : sys ===
ok
ok
=== numpy : re ===
ok
ok
=== numpy : io ===
ok
ok
=== numpy : logging ===
ok
ok
=== numpy : datetime ===
ok
Traceback (most recent call last):
  File "x/y/z/main.py", line 1, in <module>
    import numpy; print "ok";
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/__init__.py", line 180, in <module>
    from . import add_newdocs
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/__init__.py", line 14, in <module>
    from . import multiarray
AttributeError: 'module' object has no attribute 'datetime_CAPI'
=== pandas : os ===
ok
ok
=== pandas : sys ===
ok
ok
=== pandas : re ===
ok
ok
=== pandas : io ===
ok
Traceback (most recent call last):
  File "x/y/z/main.py", line 1, in <module>
    import pandas; print "ok";
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/__init__.py", line 22, in <module>
    from pandas.compat.numpy import *
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/compat/__init__.py", line 350, in <module>
    from dateutil import parser as _date_parser
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/dateutil/parser.py", line 37, in <module>
    from io import StringIO
ImportError: cannot import name StringIO
=== pandas : logging ===
ok
Traceback (most recent call last):
  File "x/y/z/main.py", line 1, in <module>
    import pandas; print "ok";
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/__init__.py", line 43, in <module>
    from pandas.io.api import *
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/io/api.py", line 18, in <module>
    from pandas.io.gbq import read_gbq
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/io/gbq.py", line 59, in <module>
    logger = logging.getLogger('pandas.io.gbq')
AttributeError: 'module' object has no attribute 'getLogger'
=== pandas : datetime ===
ok
Traceback (most recent call last):
  File "x/y/z/main.py", line 1, in <module>
    import pandas; print "ok";
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/__init__.py", line 13, in <module>
    __import__(dependency)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/__init__.py", line 180, in <module>
    from . import add_newdocs
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/__init__.py", line 14, in <module>
    from . import multiarray
AttributeError: 'module' object has no attribute 'datetime_CAPI'

Solution

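  • When you run a script, the first element of sys.path is the directory containing that script (in an interactive session or with -c it is an empty string, which means the current directory). Any module in that directory therefore takes precedence over a standard library module of the same name, not only in your own code but also in every library you import. With x/io.py present, the "from io import StringIO" inside dateutil finds your empty io.py instead of the standard one and fails. Names such as sys and os are unaffected because sys is built in and os is already loaded into sys.modules during interpreter startup.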
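
To check for this kind of shadowing, something like the following might help. It is only a minimal sketch: the function name find_shadowing and the CANDIDATES list are made up for illustration, and it only looks for <name>.py files and <name>/__init__.py packages in a single directory.

```python
import os
import sys


def find_shadowing(directory, names):
    """Return the names for which `directory` holds a <name>.py file
    or a <name>/ package that would be found before the standard module."""
    shadowed = []
    for name in names:
        as_module = os.path.join(directory, name + ".py")
        as_package = os.path.join(directory, name, "__init__.py")
        if os.path.isfile(as_module) or os.path.isfile(as_package):
            shadowed.append(name)
    return shadowed


# Hypothetical list of standard library names to check; extend as needed.
CANDIDATES = ["io", "logging", "datetime", "string", "re"]

if __name__ == "__main__":
    # sys.path[0] is the script's directory; an empty string means the cwd.
    script_dir = sys.path[0] or os.getcwd()
    print("sys.path[0] = %r" % sys.path[0])
    print("shadowed: %r" % find_shadowing(script_dir, CANDIDATES))
```

Saved next to main.py in the x directory from the question and run from there, this would be expected to list io among the shadowed names. Printing a module's __file__ attribute after importing it (where the module has one) is another quick way to see which file was actually loaded.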