Tags: matlab, data-structures, matrix, dataset, cell-array

When to use a cell, matrix, or table in Matlab


I am fairly new to MATLAB and I am trying to figure out when it is best to use cells, tables, or matrices to store sets of data and then work with the data.

What I want is to store data with multiple lines, each containing both strings and numbers, and then work with the numbers.

For example, a line would look like

'string 1', time, number1, number2

I know a matrix works best if all elements are numbers, but when I use a cell I keep having to convert the numbers or strings to a matrix in order to work with them. I am running MATLAB 2012, so maybe that is part of the problem. Any help is appreciated. Thanks!


Solution

  • Use a matrix when:

    • the tabular data has a uniform type (all floating-point values like double, or all integers like int32);
    • & either the amount of data is small, or it is big but has a static (predefined) size;
    • & you care about the speed of accessing the data, or you need matrix operations performed on the data, or some function requires the data organized as such.
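As a minimal sketch of the matrix case, assuming the numeric part of each line (time, number1, number2) is kept in its own matrix; the sample values and variable names here are hypothetical:

```matlab
% Hypothetical sample data: each row is [time, number1, number2].
data = [0.0  1.5  10;
        0.5  2.5  20;
        1.0  3.5  30];

% Preallocate when the final size is known in advance.
results = zeros(size(data, 1), 1);

% Column-wise operations work directly on a numeric matrix.
colMeans = mean(data, 1);                 % mean of each column
results(:) = data(:, 2) .* data(:, 3);    % element-wise product of two columns
```

No conversion step is needed here because every element already has the same numeric type.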

    Use a cell array when:

    • the tabular data has heterogeneous type (mixed element types, "jagged" arrays etc.);
    • | there is a lot of data and its size is dynamic;
    • | you only need to index the data numerically (no algebraic operations);
    • | a function requires the data as such.
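A short sketch of the cell array case, using rows shaped like the example line in the question. The values are made up for illustration. It shows one way to limit the repeated conversions mentioned in the question: convert the numeric columns to a matrix once with cell2mat, then work on that matrix.

```matlab
% Hypothetical rows in the form from the question:
% {'string', time, number1, number2}.
rows = {'string 1', 0.0, 1.5, 10;
        'string 2', 0.5, 2.5, 20;
        'string 3', 1.0, 3.5, 30};

% Parentheses return a cell; braces return the content.
firstLabel = rows{1, 1};            % char array 'string 1'
labels     = rows(:, 1);            % 3x1 cell array of strings

% Convert the numeric columns to a matrix once, then work on the matrix.
nums  = cell2mat(rows(:, 2:4));     % 3x3 double
total = sum(nums(:, 3));            % sum of the last numeric column

% Cell arrays can grow one row at a time.
rows(end + 1, :) = {'string 4', 1.5, 4.5, 40};
```

Note that nums is a copy taken before the fourth row was appended, so it would need to be rebuilt if the new row should be included.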

    Same argument for structs, only the indexing is by name, not by number.

    Not sure about tables; I don't think they are offered by the language itself. They might be a UDT (user-defined type) that I don't know of...

    Later edit

    These three types may be combined, in the sense that cell arrays and structs may have matrices, cell arrays, and structs as elements (because they're heterogeneous containers). In your case, there are two possible approaches, depending on how you need to access the data:

    • if you access the data mostly by row, then an array of N structs (one struct per row) with 4 fields (one field per column) would be the most effective in terms of performance;

    • if you access the data mostly by column, then a single struct with 4 fields (one field per column) would do: the first field would be a cell array of strings for the first column; the second field would be a cell array of strings or a 1D matrix of doubles, depending on how you want to store your dates; the rest of the fields would be 1D matrices of doubles.
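The two layouts can be sketched roughly as follows. The field names (name, time, num1, num2) and the values are hypothetical, and time is stored as a double here purely for illustration:

```matlab
% Layout 1: array of structs, one struct per row (hypothetical field names).
byRow = struct('name', {'string 1', 'string 2', 'string 3'}, ...
               'time', {0.0, 0.5, 1.0}, ...
               'num1', {1.5, 2.5, 3.5}, ...
               'num2', {10, 20, 30});
secondRow    = byRow(2);            % all fields of one row
num1FromRows = [byRow.num1];        % gather one field into a 1x3 matrix

% Layout 2: one struct, one field per column.
byCol.name = {'string 1'; 'string 2'; 'string 3'};
byCol.time = [0.0; 0.5; 1.0];
byCol.num1 = [1.5; 2.5; 3.5];
byCol.num2 = [10; 20; 30];
ratio = byCol.num2 ./ byCol.num1;   % column arithmetic needs no conversion
```

In the first layout a whole row comes back with one index, while column arithmetic needs a gathering step such as [byRow.num1]. In the second layout each numeric column is already a matrix, so arithmetic on it is direct.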