Parallel array or array of structures


I am trying to implement a sparse matrix (COO format) framework in C for shared-memory parallel computing. Initially I was planning to use an array of structs, each holding the position and value of one entry.

    typedef struct {
        unsigned int rowIdx;  // Row index
        unsigned int colIdx;  // Column index
        unsigned int dataVal; // Value
    } entity, *spMat;

How would parallel arrays perform for the same purpose?


Solution

  • This largely depends on how you intend to implement the solution. If you want to take advantage of the data-parallel features of the CPU (SIMD) or of a GPU, you are probably better off implementing this as a struct of arrays rather than an array of structs.

    typedef struct {
      unsigned int* rowIdxs;
      unsigned int* colIdxs;
      unsigned int* dataValues;
    } entity, *spMat;
    

    This makes it easier to write code that the CPU compiler's vectorizer or the GPU compiler can handle efficiently, because each field is stored contiguously. So in this case I would probably start with a struct of arrays and optimize for data parallelism.

    That said, performance will largely depend on how good your implementation is; it is possible to write a poorly performing implementation with either approach.
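
As a rough illustration of the struct-of-arrays layout in use, here is a minimal sketch. The type name `cooMat`, the `nnz` field, and the helper functions `coo_alloc`, `coo_free`, `coo_scale` and `coo_spmv` are all hypothetical and not part of the question or the answer above. The sketch assumes unsigned integer values, as in the question, and assumes `nnz` is greater than zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical struct-of-arrays COO container; nnz is an added field. */
typedef struct {
    size_t nnz;               /* number of stored entries */
    unsigned int *rowIdxs;    /* row index of each entry */
    unsigned int *colIdxs;    /* column index of each entry */
    unsigned int *dataValues; /* value of each entry */
} cooMat;

/* Release the three arrays; safe to call on a partially allocated matrix. */
static void coo_free(cooMat *m)
{
    assert(m != NULL);
    free(m->rowIdxs);
    free(m->colIdxs);
    free(m->dataValues);
    m->rowIdxs = m->colIdxs = m->dataValues = NULL;
    m->nnz = 0;
}

/* Allocate the three parallel arrays. Returns 0 on success, -1 on failure. */
static int coo_alloc(cooMat *m, size_t nnz)
{
    assert(m != NULL);
    m->nnz = nnz;
    m->rowIdxs = malloc(nnz * sizeof *m->rowIdxs);
    m->colIdxs = malloc(nnz * sizeof *m->colIdxs);
    m->dataValues = malloc(nnz * sizeof *m->dataValues);
    if (!m->rowIdxs || !m->colIdxs || !m->dataValues) {
        coo_free(m);
        return -1;
    }
    return 0;
}

/* Scale every stored value. Only dataValues is read and written, with unit
 * stride, which is the kind of loop a vectorizer can usually handle. */
static void coo_scale(cooMat *m, unsigned int factor)
{
    unsigned int *v = m->dataValues;
    for (size_t i = 0; i < m->nnz; ++i)
        v[i] *= factor;
}

/* Serial y += A * x. Several entries may share a row, so the writes to y
 * can collide if this loop is split across threads without care. */
static void coo_spmv(const cooMat *m, const unsigned int *x, unsigned int *y)
{
    for (size_t i = 0; i < m->nnz; ++i)
        y[m->rowIdxs[i]] += m->dataValues[i] * x[m->colIdxs[i]];
}
```

With an array of structs, `coo_scale` would step over the two index fields of every entry; here it walks a single contiguous array. Because several entries can update the same `y[row]`, a shared-memory version of `coo_spmv` would need something like atomic updates or a partition of the entries by row.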