python string numpy matrix concatenation

Python: Concatenation of elements in different matrices


I am having trouble concatenating the strings of two matrices element by element. Example:

  • The first matrix is dat, defined as follows:

dat = array([['data 1:1 ', 'data 2:2 '],
       ['data 1:1 ', 'data 2:2 '],
       ['data 1:1 ', 'data 2:2 ']], dtype='<U9')
  • The second matrix is defined as:
sp = array([['sim_spec_A0_grp25.pha', 'sim_spec_B0_grp25.pha'],
       ['sim_spec_A1_grp25.pha', 'sim_spec_B1_grp25.pha'],
       ['sim_spec_A2_grp25.pha', 'sim_spec_B2_grp25.pha']], dtype='<U21')

I would like to obtain the following result:

my_matrix =  ['data 1:1 sim_spec_A0_grp25.pha', 'data 2:2 sim_spec_B0_grp25.pha'],
             ['data 1:1 sim_spec_A1_grp25.pha', 'data 2:2 sim_spec_B1_grp25.pha']
             ['data 1:1 sim_spec_A2_grp25.pha', 'data 2:2 sim_spec_B2_grp25.pha']]

I tried to implement the following solution:

nset = 3
dim = 2

my_matrix = np.matrix(np.zeros((nset,dim)), dtype='str')

for i in range(0,nset):
    for j in range(0,dim):
        my_matrix[i][j]=dat[i][j]+sp[i][j]

obtaining the following error:


Traceback (most recent call last):

  File "<ipython-input-470-e62722962bc7>", line 5, in <module>
    my_matrix[i][j]=dat[i][j]+sp[i][j]

  File "/opt/anaconda3/lib/python3.7/site-packages/numpy/matrixlib/defmatrix.py", line 195, in __getitem__
    out = N.ndarray.__getitem__(self, index)

IndexError: index 1 is out of bounds for axis 0 with size 1

How can I solve the problem?

In addition, once the final matrix is built, is it possible to obtain a second matrix as follows?

Final =      ['data 1:1 sim_spec_A0_grp25.pha   data 2:2 sim_spec_B0_grp25.pha'],
             ['data 1:1 sim_spec_A1_grp25.pha   data 2:2 sim_spec_B1_grp25.pha']
             ['data 1:1 sim_spec_A2_grp25.pha  data 2:2 sim_spec_B2_grp25.pha']]

Thank you in advance for your help and availability!


Solution

  • The question contains mistakes that only you can fix, such as the missing brackets in the expected result.

    Moreover, I don't see why you would use numpy here instead of native Python lists or pandas.


    Leaving that aside, here is a solution:

    import numpy as np
    
    dat = np.array([['data 1:1 ', 'data 2:2 '],
           ['data 1:1 ', 'data 2:2 '],
           ['data 1:1 ', 'data 2:2 ']], dtype='<U9')
    
    sp = np.array([['sim_spec_A0_grp25.pha', 'sim_spec_B0_grp25.pha'],
           ['sim_spec_A1_grp25.pha', 'sim_spec_B1_grp25.pha'],
           ['sim_spec_A2_grp25.pha', 'sim_spec_B2_grp25.pha']], dtype='<U21')
    
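    # dat entries already end with a space, so no extra separator is added;
    # reshape restores the original 3x2 layout after flattening.
    result = np.array([d + s for d, s in zip(dat.flatten(), sp.flatten())]).reshape(dat.shape)
    
    print(result)
    

    out:

    [['data 1:1 sim_spec_A0_grp25.pha' 'data 2:2 sim_spec_B0_grp25.pha']
     ['data 1:1 sim_spec_A1_grp25.pha' 'data 2:2 sim_spec_B1_grp25.pha']
     ['data 1:1 sim_spec_A2_grp25.pha' 'data 2:2 sim_spec_B2_grp25.pha']]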
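
    As for the IndexError in the question: indexing an np.matrix with a single index, as in my_matrix[i], returns a 1xN matrix rather than a 1-D row, so the following [j] indexes axis 0, which has size 1. Using a plain ndarray, or indexing with my_matrix[i, j], avoids this. Below is a minimal sketch of a loop-free alternative using np.char.add, which also covers the second part of the question. The name final and the three-space separator sep are assumptions read off the expected output in the question, not something the question specifies.

```python
import numpy as np

dat = np.array([['data 1:1 ', 'data 2:2 '],
                ['data 1:1 ', 'data 2:2 '],
                ['data 1:1 ', 'data 2:2 ']])

sp = np.array([['sim_spec_A0_grp25.pha', 'sim_spec_B0_grp25.pha'],
               ['sim_spec_A1_grp25.pha', 'sim_spec_B1_grp25.pha'],
               ['sim_spec_A2_grp25.pha', 'sim_spec_B2_grp25.pha']])

# Why the original loop failed: a single index on np.matrix keeps two dimensions.
m = np.matrix(np.zeros((3, 2)))
row_shape = m[0].shape  # (1, 2), so m[0][1] asks for row 1 of a one-row matrix

# Element-wise string concatenation; the (3, 2) shape is preserved.
my_matrix = np.char.add(dat, sp)

# Second part: join the entries of each row into one string per row.
sep = "   "  # assumed separator (three spaces)
final = np.array([[sep.join(row)] for row in my_matrix])

print(my_matrix)
print(final)
```

    Here my_matrix keeps the 3x2 layout of the inputs, and final has one column holding one joined string per row.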