
How to make a pandas DataFrame (Python) display each cell in a two-dimensional (2-D) matrix format


I am trying to create a two-dimensional (2-D) data structure from a MATLAB structure imported into Python.

When I use pandas.DataFrame, each cell contains a matrix, but the matrices are displayed in list format. I would like to display them in matrix format instead.

The following code builds a DataFrame that looks similar. (It is not identical: the real data are imported from MATLAB and have a different type, which I could not recreate in Python.)

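import pandas as pd
from IPython.display import display  # already available inside Jupyter

k = [[0, 1, 2, 3, 4, 5, 6]]
df = pd.DataFrame(k)
# Object dtype lets each cell hold a nested list instead of a number.
df = df.astype(object)
df.at[0, 0] = [[1]]
df.at[0, 1] = [[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]]
df.at[0, 2] = [[0.487], [1.532], [1.544], [1.846]]
df.at[0, 3] = [[3.0]]
df.at[0, 4] = [[3.0]]
df.at[0, 5] = [[-1]]
df.at[0, 6] = [[]]
display(df)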

This results in:

[Image: result of the code]

(You can also reproduce a similar result by running the following snippet.)

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>0</th>
      <th>1</th>
      <th>2</th>
      <th>3</th>
      <th>4</th>
      <th>5</th>
      <th>6</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>[[1]]</td>
      <td>[[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]]</td>
      <td>[[0.487], [1.532], [1.544], [1.846]]</td>
      <td>[[3.0]]</td>
      <td>[[3.0]]</td>
      <td>[[-1]]</td>
      <td>[[]]</td>
    </tr>
  </tbody>
</table>

As you can see, each cell is displayed as a list, for example:

[Image: matrix displayed in list form]

(You can also reproduce a similar result by running the following snippet.)

<body>
    [[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]]
</body>

I am trying to change it to something like:

[Image: intended result]

(You can also reproduce a similar result by running the following snippet.)

.matrix {
        position: relative;
    }
    .matrix:before, .matrix:after {
        content: "";
        position: absolute;
        top: 0;
        border: 1px solid #000;
        width: 6px;
        height: 100%;
    }
    .matrix:before {
        left: -10px;
        border-right: 0;
    }
    .matrix:after {
        right: -10px;
        border-left: 0;
    }
<div align=center>
  <table class="matrix">
    <tr>
      <td>1</td>
      <td>2</td>
    </tr>
    <tr>
      <td>2</td>
      <td>4</td>
    </tr>
    <tr>
      <td>8</td>
      <td>3</td>
    </tr>
    <tr>
      <td>9</td>
      <td>7</td>
    </tr>
  </table>
</div>

Thank you.
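
For illustration, markup like the intended-result snippet above could be generated from a nested list with a small helper. This is only a sketch: the name `matrix_to_html` is hypothetical, it assumes each cell is a 2-D nested sequence of numbers, and it reuses the `matrix` class name from the CSS in the question.

```python
from html import escape


def matrix_to_html(cell):
    """Render a 2-D nested sequence as an HTML table with class "matrix"."""
    rows = [list(row) for row in cell]
    # An empty matrix such as [[]] has no values to lay out.
    if not any(rows):
        return "[]"
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(format(v, 'g'))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f'<table class="matrix">{body}</table>'


print(matrix_to_html([[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]]))
```

With pandas, a function like this could probably be passed to `df.style.format(matrix_to_html)`, since the Styler accepts a callable formatter. The `.matrix` CSS rules would still have to be supplied to the notebook separately.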


Solution

  • @Attack68, here is the code I mentioned in my reply to your beautiful answer. Keep in mind that, as I mentioned, the real data are imported from a MATLAB structure. This means the code does not work with the sample data I provided in the question itself, but it works fine with MATLAB structures imported into Python using scipy.io. I wrote this code with the help of @Valdi_Bo's answer on link and @Paul Panzer's answer on link.

    import numpy as np
    import pandas as pd
    from IPython.display import display

    # `data` is the MATLAB structure loaded with scipy.io (not shown here).
    df = pd.DataFrame(data)

    def pretty_col(data):
        # Format one matrix column as text, using Unicode bracket pieces.
        data = np.array(data)
        if data.size <= 1:
            return format(data)
        return (format(data[:, None])[1:-1]
                .replace('[', '\u23A1', 1)
                .replace(' [', '\u23A2', data.size - 2)
                .replace(' [', '\u23A3')
                .replace(']', '\u23A4', 1)
                .replace(']', '\u23A5', data.size - 2)
                .replace(']', '\u23A6'))

    def pretty_cols(data, comma=False):
        # Format every column, then join the columns line by line.
        cols = zip(*map(str.split, map(pretty_col, data.T), data.shape[1]*('\n',)))
        if comma:
            seps = data.shape[0] // 2 * ('  ',) + (', ',) + (data.shape[0] - 1) // 2 * ('  ',)
            lines = map(str.join, seps, cols)
        else:
            lines = map(''.join, cols)
        return '\n'.join(line[0] + line + line[-1] for line in lines)

    def myFmt(txt):
        # Wrap the text in HTML so the notebook keeps the line breaks.
        if txt == "":
            return "[]"
        q = r'<font>bananas\n</font>'
        q = q.replace("bananas", repr(txt))
        q = q.replace("'", '')
        return q.replace(r'\n', '<br>')

    def ttest(x):
        # Round every entry of a nested list to two decimals, in place.
        for i, k in enumerate(x):
            for j, l in enumerate(k):
                x[i][j] = float(format(l, '.2f'))
        return x

    def fixing_newline(string, prec):
        # Protect brackets and newlines with placeholders, reformat every
        # number to `prec` decimals, then restore the protected characters.
        string=string.replace("⎡⎡", ' aa ').replace("⎤⎤", ' bb ').replace("\n", ' cc ').replace("⎢⎢", ' dd ').replace("⎥⎥", ' ee ').replace("⎣⎣", ' ff ').replace("⎦⎦", ' gg ').replace("[[", ' hh ').replace("]]", ' kk ')
        chunks = string.split(' ')
        string = ""
        for k in chunks:
            try:
                string += f"{float(k):.{prec}f}"
            except ValueError:
                string += k
        string=string.replace("aa", "⎡⎡").replace("bb", '⎤⎤').replace("cc", '\n').replace("dd", '⎢⎢').replace("ee", '⎥⎥').replace("ff", '⎣⎣').replace("gg", '⎦⎦').replace("hh", '[[').replace("kk", ']]')
        return string

    def transform(tdf, prec):
        # Replace every cell by its formatted text, column by column.
        for col in tdf.columns:
            tdf[col] = tdf[col].apply(pretty_cols).apply(lambda s: fixing_newline(s, prec))

    print(df[df.columns[0]])  # inspect the first column before transforming

    transform(df, 3)
    df = df.style.format(myFmt)

    display(df)

    

    This results in something like: Results (I need help with inline images, since I do not have enough reputation.)

    However, the code is inefficient and does not always work.
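
As a possible simplification, the bracketed text could be built row by row instead of through chained string replacements. The sketch below is not the code above: `bracket_matrix` is a hypothetical name, and it assumes each cell is a rectangular 2-D nested sequence (or array) of numbers.

```python
def bracket_matrix(cell, prec=3):
    """Format a 2-D nested sequence as text with tall Unicode brackets."""
    rows = [[f"{float(v):.{prec}f}" for v in row] for row in cell]
    # An empty matrix such as [[]] has nothing to align.
    if not any(rows):
        return "[]"
    # Pad every column to the width of its widest entry.
    widths = [max(len(r[j]) for r in rows) for j in range(len(rows[0]))]
    lines = ["  ".join(s.rjust(w) for s, w in zip(r, widths)) for r in rows]
    if len(lines) == 1:
        return "[" + lines[0] + "]"
    last = len(lines) - 1
    out = []
    for i, line in enumerate(lines):
        # Top, bottom, and middle pieces of the left and right brackets.
        if i == 0:
            left, right = "\u23A1", "\u23A4"
        elif i == last:
            left, right = "\u23A3", "\u23A6"
        else:
            left, right = "\u23A2", "\u23A5"
        out.append(left + line + right)
    return "\n".join(out)


print(bracket_matrix([[1.0, 2.0], [2.0, 4.0], [8.0, 3.0], [9.0, 7.0]], prec=1))
```

In a notebook, the newlines would still need to be turned into `<br>` (as `myFmt` does above) before the text is passed to the Styler, because HTML collapses plain line breaks.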