Tags: python, math, floating-point, arbitrary-precision, quantization

General way to quantize floating-point numbers into an arbitrary number of bins?


I want to quantize a series of numbers, with maximum and minimum values X and Y respectively, into an arbitrary number of bins. For instance, if the maximum value of my array is 65535 and the minimum is 0 (do not assume the values are all integers) and I want to quantize them into 2 bins, all values greater than floor(65535/2) would become 65535 and the rest would become 0. The same should work for any number of bins from 1 to 65535. Is there an efficient and easy way to do this? If not, how can I do it efficiently when the number of bins is a power of 2? Pseudocode would be fine, but Python + NumPy is preferred.


Solution

  • It's not the most elegant solution, but:

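    import numpy as np

    MIN_VALUE = 0
    MAX_VALUE = 65535
    NO_BINS = 2

    # Create a random dataset from the [0, 65535] interval
    numbers = np.random.randint(MIN_VALUE, MAX_VALUE + 1, 100)

    # Create the lower edge of each bin (linspace avoids the length
    # ambiguity of np.arange with a non-integer step)
    bins = np.linspace(MIN_VALUE, MAX_VALUE, NO_BINS, endpoint=False)

    # Get the bin index (1..NO_BINS) of each number
    digits = np.digitize(numbers, bins)

    # Get the output value of each bin: NO_BINS evenly spaced levels
    _, bin_val = np.histogram(numbers, NO_BINS - 1, range=(MIN_VALUE, MAX_VALUE))

    # Replace each number with the value of its bin
    for iter_bin in range(1, NO_BINS + 1):
        numbers[digits == iter_bin] = bin_val[iter_bin - 1]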
    

    UPDATE

    Does the same job:

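    import pandas as pd
    import numpy as np

    # Output value of each bin; equivalently:
    # bin_labels = [MIN_VALUE + i * ((MAX_VALUE - MIN_VALUE) / (NO_BINS - 1)) for i in range(NO_BINS)]
    _, bin_labels = np.histogram(numbers, NO_BINS - 1, range=(MIN_VALUE, MAX_VALUE))

    # With an integer bin count, pd.cut splits the range of the data itself
    # (not [MIN_VALUE, MAX_VALUE]) into NO_BINS equal-width bins
    quantized = pd.cut(numbers, NO_BINS, right=False, labels=bin_labels)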
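
For floating-point input, the same mapping can probably also be written without a loop or a histogram call. The following is only a minimal sketch and not part of the answer above: the function name `quantize` and its parameters are made up for illustration, and it assumes `hi > lo` and `n_bins >= 2`. Each value gets a bin index floor((x - lo) / (hi - lo) * n_bins), and that index is then looked up in a table of `n_bins` evenly spaced output levels between `lo` and `hi`.

```python
import numpy as np


def quantize(values, lo, hi, n_bins):
    """Map each value in [lo, hi] to one of n_bins evenly spaced levels.

    Hypothetical helper for illustration; assumes hi > lo and n_bins >= 2.
    """
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    values = np.asarray(values, dtype=float)
    # Bin index in 0..n_bins-1. The clip puts values equal to hi into the
    # last bin and guards against inputs slightly outside [lo, hi].
    idx = np.floor((values - lo) / (hi - lo) * n_bins).astype(int)
    idx = np.clip(idx, 0, n_bins - 1)
    # Output levels: n_bins equally spaced values from lo to hi inclusive.
    levels = np.linspace(lo, hi, n_bins)
    return levels[idx]


data = np.array([0.0, 100.5, 32767.0, 32768.0, 65535.0])
print(quantize(data, 0, 65535, 2))
print(quantize(data, 0, 65535, 4))
```

With two bins the threshold falls at 32767.5, so 32767 maps to 0 and 32768 maps to 65535, which matches the example in the question. Because the work is a few array operations, the cost does not depend on the number of bins, so powers of 2 need no special handling in this sketch.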