How to programmatically generate a dataset object from the Cartesian product (aka "cross-join") of multiple one-dimensional cell arrays?

I have n cell arrays c1, c2, …, cn, having dimensions L1 × 1, L2 × 1, …, Ln × 1, respectively. (FWIW, all the elements within any one cell array are of a single class, but this class may not be the same for all the arrays.)

I want to produce a dataset object representing the Cartesian product (aka "cross-join") of these n cell arrays.

I'm looking for a programmatic way to do this that will work for any n.

To be clear about what I mean by "Cartesian product" (or "cross-join"): I want to produce a dataset object containing n columns and L1 × L2 × … × Ln rows, one row for each possible combination of an entry from c1, an entry from c2, …, an entry from c(n-1), and an entry from cn. (It's OK to assume that none of c1, c2, …, cn contains duplicate entries. IOW, one may assume that every ci is equal to unique(ci).)

An example where n = 3 is given below; the desired result is the dataset object factors. (Of course, the column names of factors are an additional parameter. Also, in this example all the cell arrays contain strings but, as already mentioned, in general the different arrays will contain entries of different classes.)

>> c1

c1 = 

    'even'
    'odd'


>> c2

c2 = 

    'red'
    'yellow'
    'green'


>> c3

c3 = 

    'spades'
    'hearts'
    'diamonds'
    'clubs'

>> factors

factors = 

    Parity        TrafficLight    Suit          
    'even'        'red'           'spades'      
    'even'        'red'           'hearts'      
    'even'        'red'           'diamonds'    
    'even'        'red'           'clubs'       
    'even'        'yellow'        'spades'      
    'even'        'yellow'        'hearts'      
    'even'        'yellow'        'diamonds'    
    'even'        'yellow'        'clubs'       
    'even'        'green'         'spades'      
    'even'        'green'         'hearts'      
    'even'        'green'         'diamonds'    
    'even'        'green'         'clubs'       
    'odd'         'red'           'spades'      
    'odd'         'red'           'hearts'      
    'odd'         'red'           'diamonds'    
    'odd'         'red'           'clubs'       
    'odd'         'yellow'        'spades'      
    'odd'         'yellow'        'hearts'      
    'odd'         'yellow'        'diamonds'    
    'odd'         'yellow'        'clubs'       
    'odd'         'green'         'spades'      
    'odd'         'green'         'hearts'      
    'odd'         'green'         'diamonds'    
    'odd'         'green'         'clubs'       


  • This works for

    • an arbitrary number of cell arrays, n;
    • an arbitrary size of each cell array;
    • an arbitrary type for each cell's contents.

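    It makes use of cellfun, arrayfun and comma-separated lists. The Cartesian product is computed on indices (not on the actual elements) using ndgrid. Since ndgrid varies its first output fastest, the index ranges are reversed with fliplr before the call and the outputs are reversed back afterwards. That yields the order you want (first column varies slowest, last column varies fastest).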
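
    To see what the two fliplr calls do, here is a small sketch on two index ranges only (2 choices for the first set, 3 for the second). The variable pairs is just a name I use for illustration.

    ```matlab
    nums = [2 3];                                      %// sizes of two hypothetical sets
    vec  = fliplr(arrayfun(@(n) 1:n, nums, 'uni', 0)); %// {1:3, 1:2}, i.e. reversed ranges
    inds = cell(1, numel(nums));
    [inds{:}] = ndgrid(vec{:});                        %// first output varies fastest
    inds = fliplr(inds);                               %// back to the original set order
    pairs = [inds{1}(:), inds{2}(:)];                  %// one row per index combination
    disp(pairs)
    ```

    The rows of pairs come out as (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), so the first column varies slowest.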

    The code below returns the result as a cell array with n columns. If you need it in the form of a dataset, define appropriate column names and use cell2dataset to convert.

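    c1 = {'even','odd'}; %// example data
    c2 = {'green','red','yellow'};
    c3 = {'clubs','diamonds','hearts','spades'};
    sets = {c1, c2, c3}; %// can have an arbitrary number of c's
    num = numel(sets); %// number of cell arrays, n
    nums = cellfun(@numel, sets); %// length of each cell array
    inds = cell(1,num); %// one index grid per cell array
    vec = fliplr(arrayfun(@(n) 1:n, nums, 'uni', 0)); %// index ranges, in reversed order
    [inds{:}] = ndgrid(vec{:}); %// Cartesian product of the indices
    inds = fliplr(inds); %// restore the original order of the sets
    factors = arrayfun(@(n) {sets{n}{inds{n}}},1:num, 'uni', 0); %// pick the elements of each set
    factors = cat(1, factors{:}).'; %// stack and transpose: one column per set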


    >> factors
    factors = 
        'even'    'green'     'clubs'   
        'even'    'green'     'diamonds'
        'even'    'green'     'hearts'  
        'even'    'green'     'spades'  
        'even'    'red'       'clubs'   
        'even'    'red'       'diamonds'
        'even'    'red'       'hearts'  
        'even'    'red'       'spades'  
        'even'    'yellow'    'clubs'   
        'even'    'yellow'    'diamonds'
        'even'    'yellow'    'hearts'  
        'even'    'yellow'    'spades'  
        'odd'     'green'     'clubs'   
        'odd'     'green'     'diamonds'
        'odd'     'green'     'hearts'  
        'odd'     'green'     'spades'  
        'odd'     'red'       'clubs'   
        'odd'     'red'       'diamonds'
        'odd'     'red'       'hearts'  
        'odd'     'red'       'spades'  
        'odd'     'yellow'    'clubs'   
        'odd'     'yellow'    'diamonds'
        'odd'     'yellow'    'hearts'  
        'odd'     'yellow'    'spades'
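
    For the conversion to a dataset, here is a minimal sketch, assuming the Statistics Toolbox (which provides dataset and cell2dataset) is installed. The 4-row factors below is a small stand-in for the full result above, and varNames is a name I introduce for illustration. As far as I know, cell2dataset reads the first row as variable names by default, so ReadVarNames is switched off explicitly here.

    ```matlab
    %// Small stand-in for the cell array of combinations (2 x 2 = 4 rows)
    factors = {'even', 'red';
               'even', 'green';
               'odd',  'red';
               'odd',  'green'};
    varNames = {'Parity', 'TrafficLight'}; %// column names, as in the question's example

    %// Keep the first row as data and supply the column names ourselves
    ds = cell2dataset(factors, 'ReadVarNames', false, 'VarNames', varNames);
    disp(size(ds))
    ```

    With the full 24-by-3 cell array you would pass three names instead, e.g. {'Parity','TrafficLight','Suit'}.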