I am trying to convert a NxM matrix into a dataframe: for example, 2x2 matrix with row as A,B and column as 1,2, such as:
The cells inside each matrix contains 12 strings. I want to convert this matrix into 12 columns (same length as the strings), so it will look something like this, except with 12 columns
I tried using code below but no luck. Any help is appreciated!
sample_layout = pd.read_excel(file_name.xlxs,engine='openpyxl',sheet_name=1, skiprows=0) #skip the first row which is the indexes of the well
column_names = ['ID','sample_ID','sample_type','pilot_batch','tissue','hide_ID','hide_replicate','avg loaded weight mg','avg empty weight mg','net weight mg','date','operator']
df_converted = pd.DataFrame(sample_layout, columns=column_names)
Assuming the data is like in your example, you can do this:
import pandas as pd
df.melt()['value'].str.split(',', expand=True)
# similar alternative:
df.stack().str.split(',', expand=True)
# Output
0 1 2 3 4 5 6 7 8 9 10 11
0 19 SZ-0602-3- 1 sample 602 3 0602- 3 1 1640.605 1629.3 11.305 45133 SZ
1 20 SZ-0602-3- 2 sample 602 3 0602- 3 2 1628.315 1617.095 11.2 2 45133 SZ
2 19 SZ-0602-3- 1 sample 602 3 0602- 3 1 1640.605 1629.3 11.305 45133 SZ
3 20 SZ-0602-3- 2 sample 602 3 0602- 3 2 1628.315 1617.095 11.2 2 45133 SZ
# Reproducible data:
df = pd.DataFrame({
1: ["19,SZ-0602-3- 1, sample,602,3,0602- 3,1,1640.605,1629.3,11.305 ,45133,SZ", "20,SZ-0602-3- 2, sample,602,3,0602- 3,2,1628.315,1617.095,11.2 2,45133,SZ"],
2: ["19,SZ-0602-3- 1, sample,602,3,0602- 3,1,1640.605,1629.3,11.305 ,45133,SZ", "20,SZ-0602-3- 2, sample,602,3,0602- 3,2,1628.315,1617.095,11.2 2,45133,SZ"]
})