python-3.x, python-2.7, scikit-learn, binning, discretization

Sklearn binning process: is it possible to return an interval?


I'm trying to use KBinsDiscretizer from sklearn.preprocessing, but it returns integer values such as 1, 2, ..., N (each representing an interval). Is it possible to return the actual interval, such as (0.2, 0.5), or is this not implemented yet?


Solution

  • Based on the docs (https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.KBinsDiscretizer.html):

    Attributes:

    n_bins_ : int array, shape (n_features,)
    Number of bins per feature. Bins whose width are too small (i.e., <= 1e-8) are removed with a warning.

    bin_edges_ : array of arrays, shape (n_features, )
    The edges of each bin. Contain arrays of varying shapes (n_bins_, ). Ignored features will have empty arrays.
    

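    So the answer is no: the transformer itself returns only the bin indices, not the intervals. The interval for each index can, however, be looked up in bin_edges_. The docs give another hint: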

    The inverse_transform function converts the binned data into the original feature space. Each value will be equal to the mean of the two bin edges.
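
As a rough illustration of the two hints above, the sketch below rebuilds intervals from bin_edges_ and compares them with what inverse_transform returns. It assumes encode="ordinal" (so that transform yields one integer code per feature) and strategy="uniform"; the helper name codes_to_intervals and the sample data are made up for this example and are not part of scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer


def codes_to_intervals(discretizer, codes):
    """Hypothetical helper: map ordinal bin codes to (left, right) edge pairs.

    Returns one list per sample, holding one (left, right) tuple per feature.
    """
    codes = np.asarray(codes).astype(int)
    edges = discretizer.bin_edges_  # one array of edges per feature
    return [
        [(float(edges[j][c]), float(edges[j][c + 1])) for j, c in enumerate(row)]
        for row in codes
    ]


# Made-up single-feature data spanning 0 to 9.
X = np.array([[0.0], [1.0], [2.5], [4.0], [7.5], [9.0]])

# encode="ordinal" is assumed here; the default one-hot output would first
# have to be converted back to integer codes.
est = KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform")
codes = est.fit_transform(X)

intervals = codes_to_intervals(est, codes)

# inverse_transform gives the midpoint of each bin, not the interval itself.
centers = est.inverse_transform(codes)

for value, interval, center in zip(X[:, 0], intervals, centers[:, 0]):
    print(value, "->", interval[0], "midpoint", center)
```

Which side of each interval is closed is not decided by this lookup; the helper only reports the two edges that bound each bin.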