python arrays numpy dimensions brackets

Removing length 1 dimensions in arrays


I'm a real beginner with Python, and I have a recurrent problem with my ndarrays. I'm very confused with the brackets (is there any schematic synthesis of the use of brackets in Python anywhere?). I always end up having arrays with many dimensions. Right now I have this one:

>>> values
Out[1]: 
array([[[ array([[ 4.23156519, -0.93539198],
       [ 3.50074853, -1.67043386],
       [ 4.64192393, -1.03918172],
       [ 4.52056725,  0.2561267 ],
       [ 3.36400016,  0.26435125],
       [ 3.82025672,  1.16503286]])]]], dtype=object)

From here, how can I reduce the dimensions? I just wanted a 6x2 array. I tried np.reshape, but since the current shape of values is (1,1,1), I can't directly reshape the array into a 6x2 one.

I'm sorry for the silly question. I'm seeking a general and schematic answer that would explain to me how to go from a higher dimension to a lower one and vice versa.

Here is how I created the array (values is clustered_points):

indices=[] # initialize indices
clustered_points=[] # initialize array containing points in different sub-arrays=clusters
for k in range(len(mu)):
    a=r[:,k]
    index=[t for t in range(len(a)) if a[t] == 1]
    indices.append(index)
    clustered_points.append(data[indices[k]])
clustered_points=np.reshape(clustered_points,(len(clustered_points),1,1))

Solution

  • To make an array that matches your initial display, I have to take special care to embed one array within another:

    In [402]: x=np.array([[ 4.23156519, -0.93539198],
           [ 3.50074853, -1.67043386],
           [ 4.64192393, -1.03918172],
           [ 4.52056725,  0.2561267 ],
           [ 3.36400016,  0.26435125],
           [ 3.82025672,  1.16503286]])
    In [403]: a=np.array([[[None]]],dtype=object)
    In [404]: a[0,0,0]=x
    In [405]: a
    Out[405]: 
    array([[[ array([[ 4.23156519, -0.93539198],
           [ 3.50074853, -1.67043386],
           [ 4.64192393, -1.03918172],
           [ 4.52056725,  0.2561267 ],
           [ 3.36400016,  0.26435125],
           [ 3.82025672,  1.16503286]])]]], dtype=object)
    
    In [406]: a.shape
    Out[406]: (1, 1, 1)
    In [407]: a[0,0,0].shape
    Out[407]: (6, 2)
    

    Simply copying and pasting from the display produces a different array, one with shape (1,1,1,6,2) and no nested inner array. Either way, a[0,0,0] gives the inner (6,2) array.

    reshape and squeeze work on a (1,1,1,6,2) array, but not on a (6,2) array nested inside a (1,1,1) object array, because they act only on the axes of the outer array. You need to understand the difference.
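
    To make that difference concrete, here is a minimal sketch that builds both kinds of array from the same data. The names nested and flat are my own, chosen only for illustration:

```python
import numpy as np

x = np.array([[4.23156519, -0.93539198],
              [3.50074853, -1.67043386],
              [4.64192393, -1.03918172],
              [4.52056725,  0.2561267 ],
              [3.36400016,  0.26435125],
              [3.82025672,  1.16503286]])

# Case 1: a (1,1,1) object array whose single element is the (6,2) array.
nested = np.empty((1, 1, 1), dtype=object)
nested[0, 0, 0] = x

# Case 2: an ordinary numeric array with three leading length-1 axes.
flat = x[np.newaxis, np.newaxis, np.newaxis]

print(nested.shape, nested.dtype)
print(flat.shape, flat.dtype)

# squeeze and reshape see all 12 numbers in the numeric array.
squeezed = flat.squeeze()
reshaped = flat.reshape(6, 2)

# On the object array they see exactly one element (the inner array),
# so squeeze leaves a 0-d object array and reshape(6, 2) cannot work.
nested_squeezed = nested.squeeze()
try:
    nested.reshape(6, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True

# Indexing is what actually retrieves the inner (6,2) array.
inner = nested[0, 0, 0]
print(squeezed.shape, nested_squeezed.shape, reshape_failed, inner.shape)
```

    In the nested case, the only way down to the (6,2) data is to pull the element out, for example with nested[0,0,0].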


    (edit)

    To run your 'how I did it' clip, I have to make some guesses as to the inputs (that almost merits a downvote).

    I'll guess at some inputs:

    In [420]: mu=np.arange(3); r=np.ones((4,3));data=np.ones(5)
    In [421]: %paste
    indices=[] # initialize indices
    clustered_points=[] # initialize array containing points in different sub-arrays=clusters
    for k in range(len(mu)):
        a=r[:,k]
        index=[t for t in range(len(a)) if a[t] == 1]
        indices.append(index)
        clustered_points.append(data[indices[k]])
    
    ## -- End pasted text --
    In [422]: clustered_points
    Out[422]: 
    [array([ 1.,  1.,  1.,  1.]),
     array([ 1.,  1.,  1.,  1.]),
     array([ 1.,  1.,  1.,  1.])]
    

    clustered_points is a list of several 1d arrays.

    I can do

    np.reshape(clustered_points,(12,1,1))
    np.reshape(clustered_points,(3,4,1,1))
    

    though it would be better, I think, to do np.array(clustered_points) first, and maybe even check its shape.
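
    A small sketch of that suggestion, using the same guessed inputs as above (three 1d arrays of four ones each); the variable names stacked, as_column and as_grid are only illustrative:

```python
import numpy as np

# Stand-in for the list built by the loop: three 1d arrays of length 4.
clustered_points = [np.ones(4), np.ones(4), np.ones(4)]

# Convert explicitly, then inspect before reshaping.
stacked = np.array(clustered_points)
print(stacked.shape, stacked.dtype)

# Both reshapes keep the total of 3 * 4 = 12 elements.
as_column = stacked.reshape(12, 1, 1)
as_grid = stacked.reshape(3, 4, 1, 1)
print(as_column.shape, as_grid.shape)
```

    Checking the shape and dtype right after the conversion shows whether you got a regular numeric array or something else before any reshape is attempted.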

    Since

    np.reshape(clustered_points,(len(clustered_points),1,1))
    

    supposedly works, clustered_points must be a list of n single-element arrays. But this reshape should produce an (n,1,1) array, not your (1,1,1,...) array.

    So that edit doesn't help.

    =========================

    I'm seeking a general and schematic answer that would explain to me how to go from a higher dimension to a lower one and vice versa.

    The first step is to be clear, to yourself and others, about the structure of your array. That includes knowing its shape and dtype. And if the dtype is anything other than simple numerics, pay attention to the structure of the elements (e.g. objects within the array).
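
    One way to do that inspection is sketched below. describe is a hypothetical helper of my own, not a NumPy function; it reports the outer shape and dtype and, for object arrays, the type and shape of each element:

```python
import numpy as np

def describe(arr):
    """Hypothetical helper: summarise shape, dtype and, for object arrays, each element."""
    lines = [f"shape={arr.shape} dtype={arr.dtype}"]
    if arr.dtype == object:
        # Walk every position of the outer array and look at what is stored there.
        for idx in np.ndindex(arr.shape):
            item = arr[idx]
            item_shape = getattr(item, "shape", None)
            lines.append(f"  {idx}: {type(item).__name__} shape={item_shape}")
    return "\n".join(lines)

# Example: a (1,1,1) object array holding a (6,2) float array.
outer = np.empty((1, 1, 1), dtype=object)
outer[0, 0, 0] = np.zeros((6, 2))

report = describe(outer)
print(report)
```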

    Singular dimensions (value 1) can be removed with indexing, [0], or squeeze. reshape also removes dimensions (or adds them), but you have to pay attention to the total number of elements. If the old shape had 12 elements, the new one has to have 12 as well. But reshape does not operate across dtype boundaries: it rearranges the elements of the outer array and never reaches into arrays stored as objects inside it.
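
    As a rough illustration of those options on an ordinary numeric array (the array a and the result names are made up for this example):

```python
import numpy as np

a = np.arange(12).reshape(1, 3, 1, 4)    # 12 elements, two length-1 axes

# Removing dimensions
by_index = a[0]                  # drops the first axis only
by_squeeze = a.squeeze()         # drops every length-1 axis
one_axis = a.squeeze(axis=2)     # drops just the chosen length-1 axis
flat = a.reshape(12)             # any shape with 12 elements is allowed
inferred = a.reshape(-1, 6)      # -1 lets NumPy work out that axis

# Adding dimensions
with_new = by_squeeze[np.newaxis, :, :]
expanded = np.expand_dims(by_squeeze, axis=-1)

# The element count must match, otherwise reshape raises ValueError.
try:
    a.reshape(5, 2)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

for name, arr in [("by_index", by_index), ("by_squeeze", by_squeeze),
                  ("one_axis", one_axis), ("flat", flat),
                  ("inferred", inferred), ("with_new", with_new),
                  ("expanded", expanded)]:
    print(name, arr.shape)
```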