python typeerror visualization data-analysis

Showing integer columns as categorical and throwing error in sweetviz compare


If I analyze these two datasets individually, I don't get any error, and I also get the visualization of all the integer columns. But when I try to compare these dataframes, I get the error below.

Cannot convert series 'Web Visit' in COMPARED from its TYPE_CATEGORICAL to the desired type TYPE_BOOL.

I also tried using FeatureConfig to skip the column, but to no avail.

pid_compare = sweetviz.compare([pdf,"234_7551009"],[pdf_2,"215_220941058"])


Solution

  • Maintainer of the lib here; this question was also asked on the git repository, but it will be useful to detail the answer here.

    After looking at the data you provided in the link above, it looks like the column in the first dataframe (pdf) contains only 0 and 1, so it is classified as boolean. It therefore cannot be compared against the same column in the second dataframe, which is categorical (that one has 0, 1, 2, 3, as you probably know!).

    The system will be able to handle it if you use FeatureConfig to force that column to be treated as CATEGORICAL.

    I just tried the following and it seems to work, let me know if it helps!

    feature_config = sweetviz.FeatureConfig(force_cat = ["Web Desktop Interaction"])
    report = sweetviz.compare(pdf, pdf_2, None, feature_config)
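
If several columns might be affected, it can help to list them before building the config. The sketch below is only an illustration of the mismatch described in the answer: the helper names `is_missing` and `columns_to_force_categorical` are hypothetical, the sample data is made up, and the "only 0 and 1 means boolean" rule is an assumed simplification, not sweetviz's actual type-detection code. It uses plain dicts of lists so it runs without any third-party package.

```python
def is_missing(value):
    """True for None and NaN (NaN is the only value not equal to itself)."""
    return value is None or value != value


def columns_to_force_categorical(source, compared):
    """Return shared columns that look boolean in one dataset only.

    `source` and `compared` map column names to iterables of values
    (plain dicts of lists here). "Looks boolean" means every non-missing
    value is 0 or 1, which is an assumed simplification of the rule
    described in the answer, not sweetviz's own detection logic.
    """
    mismatched = []
    for column in sorted(set(source) & set(compared)):
        source_values = {v for v in source[column] if not is_missing(v)}
        compared_values = {v for v in compared[column] if not is_missing(v)}
        source_is_bool = source_values <= {0, 1}
        compared_is_bool = compared_values <= {0, 1}
        # Flag the column only when the two datasets disagree.
        if source_is_bool != compared_is_bool:
            mismatched.append(column)
    return mismatched


# Made-up data shaped like the case in the question.
pdf = {"Web Visit": [0, 1, 1, 0], "Age": [31, 45, 27, 52]}
pdf_2 = {"Web Visit": [0, 1, 2, 3], "Age": [29, 38, 61, 44]}

force_cat = columns_to_force_categorical(pdf, pdf_2)
print(force_cat)  # ['Web Visit']
```

The resulting list could then be passed as `sweetviz.FeatureConfig(force_cat=force_cat)`, in the same way as the single column name in the answer above.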