I have a matrix which has multiple rows per ID. I need to extract only say..4 LAST rows per ID. ID cols consists of 2 Cols:
mat = [ ...
2222 1 734771 0.11
2222 1 734772 0.22
2222 1 734773 0.33
2222 1 734774 0.44
2222 1 734775 0.55
5555 3 734771 0.11
5555 1 734772 0.12
5555 1 734773 0.13
] ;
answer --> % [ 2222 1 734772 0.22 ; 2222 1 734773 0.33 ; 2222 1 734774 0.44 ; 2222 1 734775 0.55]
I'll appreciate a vectorized approach.
Here is a efficient solution using UNIQUE and ACCUMARRAY functions:
%# sample matrix, sorted by first column
mat = [randi([1 5],[20 1]) (1:20)' rand(20,1)]; %'
mat = sortrows(mat);
%# extract last N rows from each unique ID
N = 4;
[~,~,subs] = unique( mat(:,1) ); %# index into unique values
fcn = @(x){ x(max(1,end-N+1):end) }; %# at most last N values
ind = accumarray(subs, 1:numel(subs), [max(subs) 1], fcn);
ind = cell2mat( ind(cellfun(@numel,ind) >= 4) ); %# keep only those with 4+ count
result = mat(ind,:)
Using the third output of the UNIQUE function, we get for each row the index into the list of unique IDs. Then we distribute the row indices into bins (cell array) based on those subscripts. We take the last four indices from each bin, and filter those with less than 4 occurrences. We then combine them all into one vector of row indices. Finally we use it to get the corresponding rows from the original matrix.
Using the example above, we generate the following matrix:
mat =
1 2 0.70199
1 6 0.46313
1 7 0.98821
1 12 0.15645
1 13 0.67037
1 16 0.86966
2 8 0.63491
2 9 0.076501
2 15 0.55076
2 17 0.44727
2 19 0.30587
3 5 0.91502
3 10 0.97322
3 20 0.48231
4 3 0.45633
4 4 0.12363
4 11 0.18319
4 14 0.36045
5 1 0.36708
5 18 0.63084
and the result was:
result =
1 7 0.98821
1 12 0.15645
1 13 0.67037
1 16 0.86966
2 9 0.076501
2 15 0.55076
2 17 0.44727
2 19 0.30587
4 3 0.45633
4 4 0.12363
4 11 0.18319
4 14 0.36045