I would like to downsample a 3d array by taking the most frequent value (mode) of the original values. After some research, I found the block_reduce
function in skimage
library. For example, if I wanted like to take the average of the block, I can do it easily:
from skimage.measure import block_reduce
image = np.arange(4*4*4).reshape(4, 4, 4)
new_image = block_reduce(image, block_size=(2,2,2), func=np.mean, cval=np.mean(grades))
In my case, I want to pass the func argument a mode function. However, numpy doesn't have a mode function. According to the documentation, the passed function should accept 'axis' as an argument. I tried some workarounds such as writing my own function and combining np.unique
and np.argmax
, as well as passing scipy.stats.mode
as the function. All of them failed.
I wrote some nested for loops to do this but it's way too slow with large arrays. Is there an easy way to do this? I don't necessarily need to use sci-kit image library.
Thank you in advance.
Let's start with the assumption that the input image shape is divisible by block_size
, i.e. corresponding shape dimensions are divisible by each size parameter of block_size
.
So, as pre-processing, we need to make blocks out off the input image
, like so -
def blockify(image, block_size):
shp = image.shape
out_shp = [s//b for s,b in zip(shp, block_size)]
reshape_shp = np.c_[out_shp,block_size].ravel()
nC = np.prod(block_size)
return image.reshape(reshape_shp).transpose(0,2,4,1,3,5).reshape(-1,nC)
Next up, for our specific case of mode
finding, we will use bincount2D_vectorized
alongwith argmax
-
# https://stackoverflow.com/a/46256361/ @Divakar
def bincount2D_vectorized(a):
N = a.max()+1
a_offs = a + np.arange(a.shape[0])[:,None]*N
return np.bincount(a_offs.ravel(), minlength=a.shape[0]*N).reshape(-1,N)
out = bincount2D_vectorized(blockify(image, block_size=(2,2,2))).argmax(1)
Alternatively, we can use n-dim mode
-
out = mode(blockify(image, block_size=(2,2,2)), axis=1)[0]
Finally, if the initial assumption of divisibility doesn't hold true, we need to pad with the appropriate pad value. For the same, we can use np.pad
, as part of the blockify
method.