Tags: python, python-3.x, apache-spark, spark-koalas

ImportError: cannot import name 'KoalasFrame' from 'databricks.koalas'


I get the following warnings and error when I run this import:

from databricks.koalas import KoalasFrame

WARNING:root:Found pyspark version "3.5.0" installed. The pyspark version 3.2 and above has a built-in "pandas APIs on Spark" module ported from Koalas. Try `import pyspark.pandas as ps` instead. 
WARNING:root:'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to set this environment variable to '1' in both driver and executor sides if you use pyarrow>=2.0.0. Koalas will set it for you but it does not work if there is a Spark context already launched.


ImportError                               Traceback (most recent call last)
cnrl\users\yongnual\Data\Spyder_workplace\DTS_dashboard\pandas2_high_performance_testing.ipynb Cell 18 line 1
----> 1 from databricks.koalas import KoalasFrame

ImportError: cannot import name 'KoalasFrame' from 'databricks.koalas' (c:\Anaconda\envs\dash2\lib\site-packages\databricks\koalas\__init__.py)

Solution

  • databricks.koalas does not export anything named KoalasFrame, so the import fails. It is a missing name (a class), not a missing module.

    pip install databricks
    pip install koalas
    pip install pyspark
    
    from databricks.koalas import KoalasFrame
    
    #Error
    ImportError: cannot import name 'KoalasFrame' from 'databricks.koalas'
    

    You can list the names that databricks.koalas actually exports with:

    import databricks.koalas

    dir(databricks.koalas)
    
    ['DataFrame',
     'Index',
     'LooseVersion',
     'MultiIndex',
     'NamedAgg',
     'Series',
     '__all__',
     '__builtins__',
     '__cached__',
     '__doc__',
     '__file__',
     '__loader__',
     '__name__',
     '__package__',
     '__path__',
     '__spec__',
     '__version__',
     '__warningregistry__',
     '_auto_patch',
     'assert_pyspark_version',
     'broadcast',
     'concat',
     'from_pandas',
     'get_dummies',
     'get_option',
     'groupby',
     'isna',
     'isnull',
     'melt',
     'merge',
     'namespace',
     'notna',
     'notnull',
     'option_context',
     'options',
     'os',
     'pandas_wraps',
     'pyarrow',
     'pyspark',
     'range',
     'read_clipboard',
     'read_csv',
     'read_delta',
     'read_excel',
     'read_html',
     'read_json',
     'read_parquet',
     'read_spark_io',
     'read_sql',
     'read_sql_query',
     'read_sql_table',
     'read_table',
     'reset_option',
     'set_option',
     'sql',
     'to_datetime',
     'to_numeric']
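
Scanning a long dir() listing by eye is error-prone, so here is a minimal sketch of checking for a name and asking for close matches instead. The helper names (public_names, suggest_names) and the stand_in namespace are hypothetical. The stand-in only mimics a few names from the listing above so that the snippet runs without Spark installed. With Koalas installed you would pass databricks.koalas itself.

```python
import difflib
from types import SimpleNamespace

# Hypothetical stand-in for databricks.koalas, holding a few of the names
# from the dir() listing. Replace it with the real module when available.
stand_in = SimpleNamespace(
    DataFrame=object, Series=object, Index=object,
    MultiIndex=object, read_csv=object, from_pandas=object,
)


def public_names(module):
    """Return the attribute names of `module` that do not start with '_'."""
    return sorted(name for name in dir(module) if not name.startswith("_"))


def suggest_names(module, wanted, limit=3):
    """Return up to `limit` public names that look similar to `wanted`."""
    return difflib.get_close_matches(
        wanted, public_names(module), n=limit, cutoff=0.4
    )


# The name from the failing import is not there ...
print(hasattr(stand_in, "KoalasFrame"))
# ... but a similarly named class is.
print(suggest_names(stand_in, "KoalasFrame"))
```

The 0.4 similarity cutoff is an arbitrary choice for this sketch. Raise it if the suggestions are too noisy.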
    

    I assume that you meant the DataFrame class:

    from databricks.koalas import DataFrame
    

    https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.html
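
The first warning in the question also points out that pyspark 3.2 and above ships the same API as pyspark.pandas. As a rough sketch, and not the only way to do it, you could try that module first and fall back to the legacy Koalas package. The helper first_importable is a hypothetical name introduced here. The snippet itself only relies on importlib from the standard library, so it runs (and reports that nothing was found) even where neither package is installed.

```python
import importlib


def first_importable(candidates):
    """Import and return the first installed module in `candidates`, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Module (or one of its dependencies) is missing: try the next one.
            continue
    return None


# Prefer the module built into pyspark >= 3.2, then the legacy Koalas package.
ps = first_importable(["pyspark.pandas", "databricks.koalas"])

if ps is None:
    print("Neither pyspark.pandas nor databricks.koalas could be imported.")
else:
    # Both modules expose the frame class under the name DataFrame.
    print(ps.__name__, ps.DataFrame)
```

If you only target pyspark 3.2 or newer, a plain `import pyspark.pandas as ps` is enough and the fallback is unnecessary.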