I have a huge two-dimensional matrix (A). Each cell of this matrix is either empty or contains a word. I want to find the unique words in each row of this matrix separately and store them in another two-dimensional matrix (B), such that the Kth row of B contains the unique elements of the Kth row of A. I tried the following, but MATLAB reported that the input must be a cell array:
% engine
B = sort(A,2) ;
d = [true(1,size(B,2)) ; diff(B)>0] ;
B = mat2cell(B(d).',1,sum(d));
% check if B{K} contains the unique elements of the Kth row of A
for i=1:size(A,1),
tf(i) = isequal(B{i},unique(A(i,:))) ;
end
all(tf)
I would appreciate your help to solve this error.
You almost have it correct. If I understand your question correctly, you want to iterate over each row of your matrix, find the unique words, and store them in the corresponding row of an output cell array. Here is an example using a 5 x 3 cell array of words:
A = 'Hi hi hi how are you my my name is Ray Ray Ray StackOverflow StackOverflow';
Acell = reshape(strsplit(A, ' '), 3, 5).'; % // Use for MATLAB R2013a and up
%//Acell = reshape(regexp(A, ' ', 'split'), 3, 5).'; %// Use for MATLAB R2012b and below
Here is what Acell looks like:
Acell =
'Hi' 'hi' 'hi'
'how' 'are' 'you'
'my' 'my' 'name'
'is' 'Ray' 'Ray'
'Ray' 'StackOverflow' 'StackOverflow'
Now, let's insert some blank strings in the cell array to mimic your situation
Acell{1,1} = '';
Acell{4,1} = '';
Therefore:
Acell =
'' 'hi' 'hi'
'how' 'are' 'you'
'my' 'my' 'name'
'' 'Ray' 'Ray'
'Ray' 'StackOverflow' 'StackOverflow'
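One possible cause of the "input must be a cell array" error is that the empty cells in your real data hold [] rather than '', in which case the array is no longer a cell array of strings and unique rejects it. This is only a guess about your data. A minimal sketch of normalizing such cells, where Araw and emptyMask are hypothetical names made up for the illustration:

```matlab
% Hypothetical raw data where the empty cells hold [] instead of ''
Araw = {[]    'hi'  'hi';
        'how' 'are' 'you'};

% iscellstr(Araw) is false at this point, so unique(Araw(1,:)) would error
emptyMask = cellfun(@isempty, Araw);   % logical mask of the empty cells
Araw(emptyMask) = {''};                % replace every [] with an empty string

u = unique(Araw(1, :));                % now works on the first row
```

After this step every cell contains a character array, so the rest of the approach applies unchanged.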
Now, let's initialize B as a cell array that will store this output:
B = cell(size(Acell, 1), 1);
This will have as many rows as Acell. However, the number of unique words differs from row to row, and MATLAB does not support matrices with a different number of columns per row. As such, the only way to accomplish what you want is for each element of B to be a cell array itself. Now, we can simply loop through each row of Acell, run unique, then assign the result to the corresponding element of B:
for idx = 1 : size(Acell, 1)
B{idx} = unique(Acell(idx,:));
end
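If the empty cells should not be reported as words, a variant of the loop above could filter them out before calling unique. This is just a sketch under that assumption; Bclean and row are names introduced here for illustration, and the sample data is retyped so the snippet runs on its own:

```matlab
% Sample data with the same layout as Acell above
Acell = {''    'hi'            'hi';
         'how' 'are'           'you';
         'my'  'my'            'name';
         ''    'Ray'           'Ray';
         'Ray' 'StackOverflow' 'StackOverflow'};

Bclean = cell(size(Acell, 1), 1);          % one output cell per row
for idx = 1 : size(Acell, 1)
    row = Acell(idx, :);                   % 1 x N cell array for this row
    row = row(~cellfun(@isempty, row));    % drop the empty entries
    Bclean{idx} = unique(row);             % sorted unique words of the row
end
```

With this variant the first and fourth rows would each hold a single word instead of an empty string plus a word.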
Now let's see B:
B =
{1x2 cell}
{1x3 cell}
{1x2 cell}
{1x2 cell}
{1x2 cell}
Let's display each cell on its own:
for idx = 1 : numel(B)
disp(B{idx});
end
We thus get:
'' 'hi'
'are' 'how' 'you'
'my' 'name'
'' 'Ray'
'Ray' 'StackOverflow'
You'll notice that the words are also sorted in alphabetical order. That's how unique orders things (strictly by character code, so upper-case letters sort before lower-case ones). Also note that unique is case-sensitive: it differentiates between upper-case and lower-case letters. As such, Hi and hi would count as different words. If this is not your desired behaviour and case should not matter, convert all of the letters to lower case with the lower function before doing any processing. You can convert all of your strings to lower case by using cellfun:
Alower = cellfun(@lower, Acell, 'UniformOutput', false);
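To make the effect of case concrete, here is a small sketch on a made-up row (row, caseSensitive and caseInsensitive are hypothetical names). It relies on lower also accepting a cell array of strings directly, so cellfun is optional:

```matlab
% Hypothetical row with mixed-case duplicates
row = {'Hi' 'hi' 'HI' 'Ray'};

caseSensitive   = unique(row);          % 'HI', 'Hi' and 'hi' stay distinct
caseInsensitive = unique(lower(row));   % all three collapse to 'hi'
```

The trade-off is that the original capitalisation is lost in the output, so Ray comes back as ray.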
If you compare B with Acell, you can see that each element of B holds the unique words of the corresponding row.