python excel pyspark databricks spark-koalas

Column with no values gives the error 'can not infer schema' when reading Excel into a DataFrame with Koalas read_excel()


When reading an Excel file into a DataFrame with Databricks Koalas read_excel() and dtype=str, a column that has no values raises the error:

can not infer schema from empty dataset

How can I solve this? If I change dtype to None, no error is thrown, but numeric data is then read in scientific notation.

I tried writing a converter:

converters={i : (lambda x: str(x) if x or x!='' else np.NaN) for i in range(col_count)}

(dtype=str does not work together with converters, so I removed it.) However, this reads the string 'NA' as null. I want the data exactly as it appears in the source file.


Solution

  • The issue was resolved by passing dtype=str and na_filter=False when calling read_excel().
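
A minimal sketch of that call, for illustration only. The helper names text_read_options and read_excel_as_text and the sheet_name default are assumptions, not part of the original answer. The sketch also assumes that Koalas forwards na_filter to the underlying pandas reader, which matches the behaviour reported above. With NA detection switched off, the string 'NA' should be kept as text, and empty cells should come through as empty strings rather than nulls.

```python
def text_read_options(sheet_name=0):
    """Build keyword arguments that keep every cell as literal text.

    Hypothetical helper: only dtype and na_filter come from the answer above.
    """
    return {
        "sheet_name": sheet_name,
        "dtype": str,        # read every column as a string, so numbers keep their written form
        "na_filter": False,  # skip NA detection, so 'NA' and empty cells are not turned into nulls
    }


def read_excel_as_text(path, sheet_name=0):
    """Read an Excel sheet with Koalas, keeping all values as strings."""
    # Imported lazily so the options helper can be used without Koalas installed.
    import databricks.koalas as ks

    return ks.read_excel(path, **text_read_options(sheet_name))
```

It would be called as read_excel_as_text("/path/to/file.xlsx"), where the path is a placeholder. The converters argument is left out on purpose, since the question notes that it does not combine with dtype=str.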