I want to quantize a series of numbers, which have a maximum and minimum value of X and Y respectively, into an arbitrary number of bins. For instance, if the maximum value of my array is 65535 and the minimum is 0 (do not assume these are all integers) and I want to quantize the values into 2 bins, all values greater than floor(65535/2) would become 65535 and the rest would become 0. The same applies if I want to quantize the array into any number of bins between 1 and 65535. Is there an efficient and easy way to do this? If not, how can I do this efficiently when the number of bins is a power of 2? Pseudocode would be fine, but Python + NumPy is preferred.
It's not the most elegant solution, but:
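import numpy as np

MIN_VALUE = 0
MAX_VALUE = 65535
NO_BINS = 2
# Create a random dataset from the [0, 65535] interval
numbers = np.random.randint(MIN_VALUE, MAX_VALUE + 1, 100)
# Create the left bin edges (one per bin)
bins = np.arange(MIN_VALUE, MAX_VALUE, (MAX_VALUE - MIN_VALUE) / NO_BINS)
# Assign each value the 1-based index of the bin it falls in
digits = np.digitize(numbers, bins)
# Get the output levels: the edges of NO_BINS-1 equal bins give NO_BINS values
_, bin_val = np.histogram(numbers, NO_BINS - 1, range=(MIN_VALUE, MAX_VALUE))
# Change the values to the bin value
for iter_bin in range(1, NO_BINS + 1):
    numbers[np.where(digits == iter_bin)] = bin_val[iter_bin - 1]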
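The loop above can also be written without iteration. The following is only a sketch under some assumptions: the helper name `quantize` and its parameters are hypothetical, `hi` is assumed to be strictly greater than `lo`, and a value sitting exactly on a bin boundary is placed in the upper bin (so 32767.5 maps to 65535 for 2 bins). It computes each value's bin index arithmetically and looks up the output level, and it works on floats as well as integers.

```python
import numpy as np

def quantize(values, n_bins, lo, hi):
    """Map each value to one of n_bins evenly spaced levels in [lo, hi].

    Hypothetical helper; assumes hi > lo and n_bins >= 1.
    """
    values = np.asarray(values, dtype=float)
    if n_bins == 1:
        # A single bin collapses everything to the lowest level
        return np.full_like(values, lo)
    # Index of the equal-width bin each value falls in (0 .. n_bins - 1)
    idx = np.floor((values - lo) / (hi - lo) * n_bins).astype(int)
    # Clip so that hi itself (and any out-of-range value) lands in a valid bin
    idx = np.clip(idx, 0, n_bins - 1)
    # Output levels: n_bins evenly spaced values from lo to hi inclusive
    levels = np.linspace(lo, hi, n_bins)
    return levels[idx]

quantized = quantize([0, 100.5, 32767.4, 32767.5, 65535], 2, 0, 65535)
```

Because the bin index comes from a single division rather than a search, the cost is the same for any number of bins, so powers of 2 need no special handling here.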
UPDATE
Does the same job:
import pandas as pd
import numpy as np
# or bin_labels = [i*((MAX_VALUE - MIN_VALUE) / (NO_BINS-1)) for i in range(NO_BINS)]
_, bin_labels = np.histogram(numbers, NO_BINS-1, range=(MIN_VALUE, MAX_VALUE))
# pd.cut returns a Categorical whose categories are the bin labels
quantized = pd.cut(numbers, NO_BINS, right=False, labels=bin_labels)