
How to get new variable if value changes along the columns in Matlab


How can I create a new variable (Output) that contains only the rows in which the value changes across the years (columns), e.g. row 1? The first column should not be considered. Example:

Input:

Title x '97 '99 '00 '01 '02 
%row1 13 189 189 39  39  39
      16 183 183 183 183 183
      18 76  76  76  28  28
      22 []  []  123 123 123
      25 12  12  12  []  []

Output:

  x '97 '99 '00 '01 '02 
  13 189 189 39  39  39
  18 76  76  76  28  28

Solution

  • I'm going to assume that your matrix is a 2D cell array. We will ignore the first column as you said. The basic procedure will be this:

    1. Create a new cell array that has as many rows as your original cell array.
    2. For each row of the original cell array, convert the row (minus its first column) into a numeric vector and store it as one element of the new cell array. Blank entries are dropped in the conversion. This can't be vectorized because rows with blanks produce shorter vectors than complete rows, so a loop is required.
    3. For each row vector, use diff to find the difference between neighbouring elements. If any of the differences are non-zero, then this is a row that we need. The output of this step is a logical mask marking the rows that have at least one change along the columns.
    4. Use the mask from Step #3 to subset the original cell array.

    Without further ado:

    %// Sample data
    A = {13 189 189 39  39  39
        16 183 183 183 183 183
        18 76  76  76  28  28
        22 []  []  123 123 123
        25 12  12  12  []  []};
    
    %// Create new cell array that contains
    %// the rows of your cell array converted into a vector
    B = cell(size(A,1),1);
    
    %// For each row, extract everything but the first column
    %// and convert to a vector
    for ind = 1 : size(A,1)
        B{ind} = cell2mat(A(ind,2:end));
    end
    
    %// Find those rows where we
    %// find at least one transition
    C = cellfun(@(x) any(diff(x) ~= 0), B);
    
    %// Subset your original cell matrix
    %// to just have these rows
    finalMat = A(C,:)
    

    Let's step through this code slowly. First, I create your original cell array as per your post. The next statement creates an empty cell array B with as many elements as there are rows in the original. The for loop then goes through every row of the original cell array. For each row, we skip the first column and grab the cells in all of the other columns. We want these as a normal numeric vector, which is why cell2mat is applied to the extracted cells. The resulting vector is placed in the corresponding location in B.
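
    As a small illustration of what the loop body does to one row, the sketch below runs cell2mat on a single row that contains blanks. The variable names row and v are made up for this example; the data is the fourth row of the sample with its first column removed.

```matlab
%// Fourth row of the sample data, without the first column
row = {[] [] 123 123 123};

%// The empty cells contribute nothing to the concatenation,
%// so only the three real values survive
v = cell2mat(row);
```

    Here v should come out as the 1-by-3 vector [123 123 123], which is shorter than the 1-by-5 vectors produced by the complete rows.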

    We have to do it row by row because some rows contain empty cells. If we called cell2mat on the entire 2D cell array, we would get an error about inconsistent dimensions, since cell2mat needs every row to produce the same number of columns and the rows with blanks come out shorter. We then use cellfun to iterate over the cells of B, which is now populated. Each cell holds one row of the original cell array as a vector, with the empty entries dropped. For each of these row vectors, we use diff to take the differences between neighbouring elements. diff works like so: for a value at index i in your array x, the output at index i of the array y is:

    y_i = x_{i+1} - x_i
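
    To make the formula concrete, here is diff applied to two of the sample rows, with the first column already removed. The names changed, constant, d1 and d2 are only for illustration.

```matlab
changed  = [189 189 39 39 39];     %// the row whose first column is 13
constant = [183 183 183 183 183];  %// the row whose first column is 16

d1 = diff(changed);   %// [0 -150 0 0], one non-zero entry at the 189 -> 39 step
d2 = diff(constant);  %// [0 0 0 0], no change anywhere
```

    Note that each result has one element fewer than its input, since there is one difference per neighbouring pair.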
    

    Therefore, if all of the elements in the row vector are the same, diff generates an array of all zeroes. If there is at least one change, then this is a row that we need to extract, which is why we use any(diff(x) ~= 0). Should there be at least one change in the row vector, we output true (1). If all the elements are the same, we output false (0). The output of cellfun therefore tells us, for each row, whether it is one we are looking for. We then use this output to subset the original cell array, keeping only those rows that have at least one transition along their columns.
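
    A short sketch of these last two steps on the sample data. B is written out by hand here so that the snippet runs on its own; in the code above it is built by the loop. The name keptIDs is an extra variable added only to show which rows survive.

```matlab
A = {13 189 189 39  39  39
     16 183 183 183 183 183
     18 76  76  76  28  28
     22 []  []  123 123 123
     25 12  12  12  []  []};

%// Row vectors as the loop would produce them (blanks dropped)
B = {[189 189 39 39 39]
     [183 183 183 183 183]
     [76 76 76 28 28]
     [123 123 123]
     [12 12 12]};

%// One logical value per row: true if any neighbouring pair differs
C = cellfun(@(x) any(diff(x) ~= 0), B);   %// [1; 0; 1; 0; 0]

%// Logical indexing keeps only the rows where C is true
finalMat = A(C,:);
keptIDs = [finalMat{:,1}];                %// [13 18]
```

    One consequence of this test: a row with only a single non-blank value gives an empty diff, and any of an empty array is false, so such a row is not selected.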


    Input:

    A = 
    
     [13]    [189]    [189]    [ 39]    [ 39]    [ 39]
     [16]    [183]    [183]    [183]    [183]    [183]
     [18]    [ 76]    [ 76]    [ 76]    [ 28]    [ 28]
     [22]       []       []    [123]    [123]    [123]
     [25]    [ 12]    [ 12]    [ 12]       []       []
    

    Your output is stored in finalMat, and it looks like:

    finalMat = 
    
     [13]    [189]    [189]    [39]    [39]    [39]
     [18]    [ 76]    [ 76]    [76]    [28]    [28]