Storing a CDF in a single DataFrame cell or in several columns

I have 10 item pairs (call them 1A and 1B, 2A and 2B, and so on up to 10A and 10B) in a DataFrame like this:

Item_col1    Item_col2   
1A           1B         
2A           2B         
3A           3B

Each item (e.g., 2A) has an associated cumulative distribution function (CDF). I have stored the CDFs in a list of np.arrays, [CDF_1A, CDF_2A, CDF_3A, CDF_4A]; each array has 100 elements and looks a little like this:

[0.0000, 0.0100, 0.2000,...0.9999, 1.0]

I'd like to add the CDFs to the frame, ultimately to compare them to each other (e.g., 1A to 1B, 2A to 2B), but I am at a loss as to the best way to store them in the frame.

Would it be better (and is it even possible) to store them like this:

Item_col1    Item_col2    CDF_Item_col1    CDF_Item_col2
1A           1B           CDF_1A           CDF_1B
2A           2B           CDF_2A           CDF_2B
3A           3B           CDF_3A           CDF_3B

Or should it be (or does it have to be) like this:

Item_col1    Item_col2 (As) CDF_Element1    CDF_Element2....CDF_Element100   (Bs) CDF_Element1    CDF_Element2....CDF_Element100 
1A           1B             0.0000          0.0100          1.0000                0.0000          0.0100          1.0000    
2A           2B             0.0000          0.0100          1.0000                0.0000          0.0100          1.0000
3A           3B             0.0000          0.0100          1.0000                0.0000          0.0100          1.0000

Solution

I think you can store them in long format, with the item labels repeated on every row and the CDF values in two columns, something like this:

df
   item1 item2      cdfA      cdfB
0     1A    1B  0.574843  0.501655
1     1A    1B  0.574843  0.638855
2     1A    1B  0.574843  0.827372
3     1A    1B  0.574843  0.450464
4     1A    1B  0.162894  0.501655
5     1A    1B  0.162894  0.638855
6     1A    1B  0.162894  0.827372
7     1A    1B  0.162894  0.450464
8     1A    1B  0.479719  0.501655
9     1A    1B  0.479719  0.638855
10    1A    1B  0.479719  0.827372
11    1A    1B  0.479719  0.450464
12    1A    1B  0.724478  0.501655
13    1A    1B  0.724478  0.638855
14    1A    1B  0.724478  0.827372
15    1A    1B  0.724478  0.450464
16    2A    2B  0.827809  0.709354
17    2A    2B  0.827809  0.657139
18    2A    2B  0.827809  0.115151
19    2A    2B  0.827809  0.942483
20    2A    2B  0.717945  0.709354
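
A frame with these columns could be built from the list of arrays roughly as follows. This is only a sketch under assumptions: the CDF arrays are made-up stand-ins generated by a hypothetical make_cdf helper, the extra element column (the position within each CDF) is my addition, and each row holds the A and B values at the same position, which may not be exactly how the rows above were produced.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

def make_cdf(n=100):
    """Hypothetical helper: a made-up non-decreasing array from 0.0 to 1.0."""
    c = np.cumsum(rng.random(n - 1))
    return np.concatenate([[0.0], c / c[-1]])

# Stand-ins for the real lists, e.g. [CDF_1A, CDF_2A, CDF_3A] and the B equivalents.
cdfs_a = [make_cdf() for _ in range(3)]
cdfs_b = [make_cdf() for _ in range(3)]

frames = []
for i, (a, b) in enumerate(zip(cdfs_a, cdfs_b), start=1):
    frames.append(pd.DataFrame({
        "item1": f"{i}A",               # scalar label, repeated on every row
        "item2": f"{i}B",
        "element": np.arange(len(a)),   # position within the CDF (assumed column)
        "cdfA": a,
        "cdfB": b,
    }))

# One long frame: 100 rows per item pair.
df = pd.concat(frames, ignore_index=True)
print(df.head())
```

Keeping the element position as its own column makes it possible to line up the two CDFs of a pair later without relying on row order.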

As you said, you may later want to compare the CDF values between 1A and 1B, 2A and 2B, and so on. With the dataframe laid out this way, I think those comparisons will be easier to make. If you are worried about it occupying more RAM, you can convert the item1 and item2 columns to Categorical, since their values repeat:

# Convert the repeated label columns to the memory-friendlier category dtype.
cols = ['item1', 'item2']
for col in cols:
    df[col] = df[col].astype('category')
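
To illustrate both points, here is a small sketch with made-up numbers. It assumes the two CDFs of each pair sit on matching rows and were evaluated at the same points, and it uses the largest absolute gap between them as one possible comparison; that choice of measure is mine, not something the question asked for.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format frame: 2 item pairs, 5 points per CDF (values invented).
df = pd.DataFrame({
    "item1": ["1A"] * 5 + ["2A"] * 5,
    "item2": ["1B"] * 5 + ["2B"] * 5,
    "cdfA": [0.0, 0.2, 0.5, 0.9, 1.0, 0.0, 0.1, 0.4, 0.7, 1.0],
    "cdfB": [0.0, 0.3, 0.6, 0.8, 1.0, 0.0, 0.1, 0.2, 0.9, 1.0],
})

# Memory footprint before and after converting the label columns.
before = df.memory_usage(deep=True).sum()
df[["item1", "item2"]] = df[["item1", "item2"]].astype("category")
after = df.memory_usage(deep=True).sum()
print(f"bytes before: {before}, after: {after}")

# Largest absolute difference between the two CDFs of each pair.
max_gap = (
    df.assign(gap=(df["cdfA"] - df["cdfB"]).abs())
      .groupby(["item1", "item2"], observed=True)["gap"]
      .max()
)
print(max_gap)
```

The saving from the category dtype depends on how often the labels repeat, so it is worth checking on the real frame rather than on a toy one like this.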