python, pandas, numpy, vectorization

From table to (image) array using vectorized operations


Is there a way to convert data from a table, e.g.

axis0  axis1  value
0      0      2
1      1      1
0      1      3

into an (image) array

[[2 3]
[nan 1]]

using vectorized operations or something similar? Some points are missing from the data, and those positions should hold nan or an equivalent.

Right now I do this with a for loop that inserts the values one by one into an array pre-filled with nan, but with more data this becomes quite slow.
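
For reference, a loop of the kind described might look like the sketch below. The original loop is not shown, so the variable names (`img`, `shape`) and the use of `zip` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Sample table from the question; the (1, 0) cell is missing on purpose.
df = pd.DataFrame({
    "axis0": [0, 1, 0],
    "axis1": [0, 1, 1],
    "value": [2, 1, 3],
})

# Output shape: largest index along each axis, plus one.
shape = (int(df["axis0"].max()) + 1, int(df["axis1"].max()) + 1)
img = np.full(shape, np.nan)

# One Python-level iteration per table row; this is the part that gets slow.
for i, j, v in zip(df["axis0"], df["axis1"], df["value"]):
    img[i, j] = v

print(img)
```

The cost grows with the number of rows because every assignment goes through the Python interpreter rather than a single array operation.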


Solution

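  • Option 1: Create an array of the desired shape filled with nan, then assign the values at the positions given by the dataframe's axis columns

    # shape = largest index on each axis + 1
    x, y = df[['axis0', 'axis1']].max() + 1
    # start with every cell marked as missing
    arr = np.full((x, y), np.nan)
    # a single vectorized assignment replaces the loop
    arr[df['axis0'], df['axis1']] = df['value']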
    

  • Option 2: Pivot the dataframe on the axis columns, reindex to guarantee the full shape, then convert to a NumPy array

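    x, y = df[['axis0', 'axis1']].max() + 1
    arr = (
        df
        # axis0 becomes the row labels, axis1 the column labels
        .pivot(index='axis0', columns='axis1', values='value')
        # add rows/columns that have no data at all, filled with nan
        .reindex(index=range(x), columns=range(y))
        .to_numpy()
    )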
    

    Result

    array([[ 2.,  3.],
           [nan,  1.]])
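
Both options can be run end to end with the sample table from the question. The following is a minimal sketch, assuming `axis0` and `axis1` hold non-negative integers; the names `n_rows`, `n_cols`, `arr_fill`, `arr_pivot` and `same` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Sample table from the question; the (1, 0) cell is missing on purpose.
df = pd.DataFrame({
    "axis0": [0, 1, 0],
    "axis1": [0, 1, 1],
    "value": [2, 1, 3],
})

# Output shape: largest index along each axis, plus one.
n_rows, n_cols = (df[["axis0", "axis1"]].max() + 1).to_numpy()

# Option 1: NaN-filled array, then one fancy-indexed assignment.
arr_fill = np.full((n_rows, n_cols), np.nan)
arr_fill[df["axis0"].to_numpy(), df["axis1"].to_numpy()] = df["value"].to_numpy()

# Option 2: pivot, then reindex so rows/columns without data are kept as NaN.
arr_pivot = (
    df.pivot(index="axis0", columns="axis1", values="value")
    .reindex(index=range(n_rows), columns=range(n_cols))
    .to_numpy()
)

# Check that both routes agree, treating NaN as equal to NaN.
same = np.allclose(arr_fill, arr_pivot, equal_nan=True)
print(arr_fill)
print(same)
```

One difference worth keeping in mind: if the table contains the same (axis0, axis1) pair more than once, `pivot` raises a `ValueError`, whereas the indexed assignment in Option 1 silently keeps just one of the values.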