python python-3.x pandas pandas-profiling

TypeCheckError: argument "config_file" (None) did not match any element in the union: pathlib.Path: is not an instance of pathlib.Path


I get an error when I try to generate a report with pandas-profiling. Below are the code I ran, the error I got, and the package versions I used.

Code:

import pandas as pd
from pandas_profiling import ProfileReport
df = pd.read_csv("data.csv")
profile = ProfileReport(df)
profile

Error:

---------------------------------------------------------------------------
TypeCheckError                            Traceback (most recent call last)
Cell In[18], line 1
----> 1 profile = ProfileReport(df)
      2 profile

File ~\AppData\Local\anaconda3\lib\site-packages\pandas_profiling\profile_report.py:48, in ProfileReport.__init__(self, df, minimal, explorative, sensitive, dark_mode, orange_mode, tsmode, sortby, sample, config_file, lazy, typeset, summarizer, config, **kwargs)
     45 _json = None
     46 config: Settings
---> 48 def __init__(
     49     self,
     50     df: Optional[pd.DataFrame] = None,
     51     minimal: bool = False,
     52     explorative: bool = False,
     53     sensitive: bool = False,
     54     dark_mode: bool = False,
     55     orange_mode: bool = False,
     56     tsmode: bool = False,
     57     sortby: Optional[str] = None,
     58     sample: Optional[dict] = None,
     59     config_file: Union[Path, str] = None,
     60     lazy: bool = True,
     61     typeset: Optional[VisionsTypeset] = None,
     62     summarizer: Optional[BaseSummarizer] = None,
     63     config: Optional[Settings] = None,
     64     **kwargs,
     65 ):
     66     """Generate a ProfileReport based on a pandas DataFrame
     67 
     68     Config processing order (in case of duplicate entries, entries later in the order are retained):
   (...)
     82         **kwargs: other arguments, for valid arguments, check the default configuration file.
     83     """
     85     if df is None and not lazy:

File ~\AppData\Local\anaconda3\lib\site-packages\typeguard\_functions.py:138, in check_argument_types(func_name, arguments, memo)
    135         raise exc
    137 try:
--> 138     check_type_internal(value, annotation, memo)
    139 except TypeCheckError as exc:
    140     qualname = qualified_name(value, add_class_prefix=True)

File ~\AppData\Local\anaconda3\lib\site-packages\typeguard\_checkers.py:759, in check_type_internal(value, annotation, memo)
    757     checker = lookup_func(origin_type, args, extras)
    758     if checker:
--> 759         checker(value, origin_type, args, memo)
    760         return
    762 if isclass(origin_type):

File ~\AppData\Local\anaconda3\lib\site-packages\typeguard\_checkers.py:408, in check_union(value, origin_type, args, memo)
    403         errors[get_type_name(type_)] = exc
    405 formatted_errors = indent(
    406     "\n".join(f"{key}: {error}" for key, error in errors.items()), "  "
    407 )
--> 408 raise TypeCheckError(f"did not match any element in the union:\n{formatted_errors}")

TypeCheckError: argument "config_file" (None) did not match any element in the union:
  pathlib.Path: is not an instance of pathlib.Path
  str: is not an instance of str

Versions: pandas==1.5.3 pandas-profiling==3.6.6

I couldn't find any resource that helps debug this. I tried updating pandas and pandas-profiling, but the error persists.


Solution

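  • This is a known issue. The pandas-profiling package has been renamed to ydata-profiling and the old name is deprecated, so switch to the new package.

    Install ydata-profiling with pip:

    pip install ydata-profiling
    

    Then change only the import; the rest of the existing code works as is.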

    import pandas as pd
    from ydata_profiling import ProfileReport
    
    df = pd.read_csv('file.csv')
    profile_report = ProfileReport(df)
    

    Reference: https://pypi.org/project/pandas-profiling/
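
As for why the check fails: the traceback shows the parameter declared as config_file: Union[Path, str] = None, so the default value None is not a member of the annotated union, which is exactly what the typeguard message reports. Below is a minimal sketch of that mismatch using only the standard library. The names strict, relaxed, and accepts_none are illustrative, and relaxed is a hypothetical annotation that would admit None, not the library's actual code.

```python
from pathlib import Path
from typing import Optional, Union, get_args

# Annotation as it appears in the traceback: the default is None,
# but None is not one of the union members.
strict = Union[Path, str]

# Hypothetical annotation that lists None explicitly (assumption, for illustration).
relaxed = Optional[Union[Path, str]]


def accepts_none(annotation) -> bool:
    """Return True if NoneType is one of the members of the union annotation."""
    return type(None) in get_args(annotation)


print(accepts_none(strict))   # False
print(accepts_none(relaxed))  # True
```

A runtime type checker that takes the annotation literally will therefore reject the None default, even though the caller never passed config_file.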