python-3.x, dask, dask-delayed

Running a function on a slice of a dask array


I have been trying to figure out how to execute functions on slices of a dask array. For example, if I create the following dask array:

import numpy as np
import dask.array as da
x = da.random.normal(10, 0.1, size=(200, 4), chunks=(100, 100))

and define the function:

#test function
def test(x,y,z=4):
    return x*y+z, z*y

executing

a,b = test(x[:,0],x[:,1])
a.compute()
b.compute()

works as expected, but if I try to assign these results back to x, the assignment fails:

x[:,0],x[:,1] = test(x[:,0],x[:,1])

throwing a NotImplementedError: Item assignment with <class 'tuple'> not supported. Is there a way I can work around this to do this kind of operation? Thank you.


Solution

  • For Dask, mutation is not the normal workflow: you will want to write functions that take inputs and return new values, e.g.,

    def test(x,y,z=4):
        return x*y+z, z*y
    
    a, b = test(x[:, 0], x[:, 1])
    out = da.hstack([a.reshape(200, 1), b.reshape(200, 1), 
                     x[:, 2].reshape(200, 1), x[:, 3].reshape(200, 1)])
    

    (or

    out = da.vstack([a, b, x[:, 2], x[:, 3]]).T
    

    )