Tags: python, hadoop, thrift, impala

ImportError: No module named impyla


I have installed impyla and its dependencies following this guide. The installation seems to have succeeded, as I can now see the folder "impyla-0.13.8-py2.7.egg" in the Anaconda folder (64-bit Anaconda 4.1.1).

But when I import impyla in Python, I get the following error:

>>> import impyla
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named impyla

I have installed 64-bit Python 2.7.12.

Can anybody please explain why I am getting this error? I am new to Python and have spent a lot of time on different blogs, but I haven't found much information on this yet. Thanks in advance for your time.


Solution

  • The usage is a little different than you mentioned: the package is installed under the name impyla, but the module you import is called impala (from https://github.com/cloudera/impyla)

    Impyla implements the Python DB API v2.0 (PEP 249) database interface (refer to it for API details):

    from impala.dbapi import connect
    conn = connect(host='my.host.com', port=21050)
    cursor = conn.cursor()
    cursor.execute('SELECT * FROM mytable LIMIT 100')
    print(cursor.description)  # prints the result set's schema (works on Python 2 and 3)
    results = cursor.fetchall()
    

    The Cursor object also exposes the iterator interface, which is buffered (controlled by cursor.arraysize):

    cursor.execute('SELECT * FROM mytable LIMIT 100')
    for row in cursor:
        process(row)
    

    You can also get the results back as a pandas DataFrame object:

    from impala.util import as_pandas
    df = as_pandas(cursor)  # the cursor must already have executed a query
    # carry df through scikit-learn, for example
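
Since the distribution is named impyla while the module it provides is named impala, a quick way to see which name your interpreter can actually import is to try both. The following is only a minimal sketch; the helper name can_import is made up for illustration and is not part of impyla:

```python
import importlib


def can_import(module_name):
    """Return True if module_name can be imported by this interpreter.

    Hypothetical helper for illustration; it is not part of impyla.
    """
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


# "impyla" is the name of the installed distribution,
# "impala" is the module name used in import statements.
for name in ("impyla", "impala"):
    print("%s importable: %s" % (name, can_import(name)))
```

If neither name can be imported, the egg was probably installed into a different Python installation than the one you are running; comparing sys.executable with the Anaconda folder should show whether that is the case.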