python string numpy matrix concatenation

Python: Concatenation of elements in different matrices

I am having trouble concatenating the strings of two different matrices element by element. Example:

The first matrix, dat, is defined as follows:


dat = array([['data 1:1 ', 'data 2:2 '],
       ['data 1:1 ', 'data 2:2 '],
       ['data 1:1 ', 'data 2:2 ']], dtype='<U9')

The second matrix, sp, is defined as:

sp = array([['sim_spec_A0_grp25.pha', 'sim_spec_B0_grp25.pha'],
       ['sim_spec_A1_grp25.pha', 'sim_spec_B1_grp25.pha'],
       ['sim_spec_A2_grp25.pha', 'sim_spec_B2_grp25.pha']], dtype='<U21')

I would like to obtain the following result:

my_matrix =  ['data 1:1 sim_spec_A0_grp25.pha', 'data 2:2 sim_spec_B0_grp25.pha'],
             ['data 1:1 sim_spec_A1_grp25.pha', 'data 2:2 sim_spec_B1_grp25.pha']
             ['data 1:1 sim_spec_A2_grp25.pha', 'data 2:2 sim_spec_B2_grp25.pha']]

I tried to implement the following solution:

nset = 3
dim = 2

my_matrix = np.matrix(np.zeros((nset,dim)), dtype='str')

for i in range(0,nset):
    for j in range(0,dim):
        my_matrix[i][j]=dat[i][j]+sp[i][j]

but it raises the following error:


Traceback (most recent call last):

  File "<ipython-input-470-e62722962bc7>", line 5, in <module>
    my_matrix[i][j]=dat[i][j]+sp[i][j]

  File "/opt/anaconda3/lib/python3.7/site-packages/numpy/matrixlib/defmatrix.py", line 195, in __getitem__
    out = N.ndarray.__getitem__(self, index)

IndexError: index 1 is out of bounds for axis 0 with size 1

How can I solve the problem?

In addition, once the final matrix is built, is it possible to obtain a second matrix as follows?

Final =      ['data 1:1 sim_spec_A0_grp25.pha   data 2:2 sim_spec_B0_grp25.pha'],
             ['data 1:1 sim_spec_A1_grp25.pha   data 2:2 sim_spec_B1_grp25.pha']
             ['data 1:1 sim_spec_A2_grp25.pha  data 2:2 sim_spec_B2_grp25.pha']]

Thank you in advance for your help and availability!

Solution

This question contains mistakes only you can fix, such as missing brackets in the result.

Moreover, I don't see why you would use NumPy rather than native Python lists or pandas in this case.
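As for the IndexError itself, a likely reading of the traceback is that np.matrix always stays two-dimensional: my_matrix[i] is a 1 x dim matrix rather than a 1-D row, so the following [j] indexes its single row and fails for j = 1. The sketch below reproduces this and shows the same loop working on a plain ndarray with tuple indexing. The names m, row, msg and fixed are only illustrative, and the '<U30' width is an assumption derived from the two input dtypes (9 + 21 characters).

```python
import numpy as np

dat = np.array([['data 1:1 ', 'data 2:2 '],
                ['data 1:1 ', 'data 2:2 '],
                ['data 1:1 ', 'data 2:2 ']], dtype='<U9')
sp = np.array([['sim_spec_A0_grp25.pha', 'sim_spec_B0_grp25.pha'],
               ['sim_spec_A1_grp25.pha', 'sim_spec_B1_grp25.pha'],
               ['sim_spec_A2_grp25.pha', 'sim_spec_B2_grp25.pha']], dtype='<U21')

# Indexing an np.matrix with a single integer still returns a 2-D matrix.
m = np.matrix(np.zeros((3, 2)))
row = m[0]
print(row.shape)  # a 1 x 2 matrix, not a 1-D row

# A second [j] therefore indexes axis 0 (size 1) and fails for j = 1.
try:
    row[1]
except IndexError as exc:
    msg = str(exc)
    print(msg)

# Same loop on a plain ndarray, using [i, j] instead of [i][j].
# '<U30' is wide enough for 9 + 21 characters.
fixed = np.empty(dat.shape, dtype='<U30')
for i in range(dat.shape[0]):
    for j in range(dat.shape[1]):
        fixed[i, j] = dat[i, j] + sp[i, j]

print(fixed)
```

Indexing the matrix as my_matrix[i, j] would also avoid the error, but a plain ndarray is usually the simpler choice for string data.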

Setting that aside, here is a solution:

import numpy as np

dat = np.array([['data 1:1 ', 'data 2:2 '],
       ['data 1:1 ', 'data 2:2 '],
       ['data 1:1 ', 'data 2:2 ']], dtype='<U9')

sp = np.array([['sim_spec_A0_grp25.pha', 'sim_spec_B0_grp25.pha'],
       ['sim_spec_A1_grp25.pha', 'sim_spec_B1_grp25.pha'],
       ['sim_spec_A2_grp25.pha', 'sim_spec_B2_grp25.pha']], dtype='<U21')

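# Each string in dat already ends with a space, so no extra separator is needed.
pairs = zip(dat.flatten(), sp.flatten())
# Concatenate element by element, then restore the original (3, 2) shape.
result = np.array([d + s for d, s in pairs]).reshape(dat.shape)

print(result)

Output:

[['data 1:1 sim_spec_A0_grp25.pha' 'data 2:2 sim_spec_B0_grp25.pha']
 ['data 1:1 sim_spec_A1_grp25.pha' 'data 2:2 sim_spec_B1_grp25.pha']
 ['data 1:1 sim_spec_A2_grp25.pha' 'data 2:2 sim_spec_B2_grp25.pha']]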
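
For the second part of the question, here is a minimal sketch of one way to do it. It uses np.char.add for the element-wise concatenation, which keeps the two-dimensional shape, and then joins the columns of each row into a single string. The three-space separator is an assumption read off the "Final" example in the question, and the names sep and final are only illustrative.

```python
import numpy as np

dat = np.array([['data 1:1 ', 'data 2:2 '],
                ['data 1:1 ', 'data 2:2 '],
                ['data 1:1 ', 'data 2:2 ']])
sp = np.array([['sim_spec_A0_grp25.pha', 'sim_spec_B0_grp25.pha'],
               ['sim_spec_A1_grp25.pha', 'sim_spec_B1_grp25.pha'],
               ['sim_spec_A2_grp25.pha', 'sim_spec_B2_grp25.pha']])

# Element-wise string concatenation; the (3, 2) shape is preserved.
my_matrix = np.char.add(dat, sp)

# Assumed separator: three spaces, as in the question's "Final" example.
sep = '   '

# Join the columns of each row into one string, then store the
# results as a single column of shape (3, 1).
final = np.array([sep.join(row) for row in my_matrix]).reshape(-1, 1)

print(my_matrix)
print(final)
```

If a flat list of row strings is enough, the reshape(-1, 1) at the end can be dropped.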