
# Other ways to efficiently search within an array

I started with this code (the commented-out lines were not in the original version):

``````
% mol1...; mol2...;
r1 = size(e, 1);   % number of candidates for aligned amino acid
r0 = size(e0, 1);  % number of candidates for reference amino acid
for i = 1 : r1
    %if e(i, 1) > 4
    for j = 1 : r0
        %if e0(j, 1) > 4
        if e(i, 1) == e0(j, 1)
            eI(i, j) = e(i, 1);  % number of atoms matched
            eT(i, j) = abs((e(i, 2) - e0(j, 2)) / e0(j, 2) * 100);
        end
        %end
    end
    %end
end
``````

`mol1` and `mol2` hold the combinations of selected atoms together with the number of atoms selected. For example, for a 3-atom molecule: (1, 1,0,0), (1, 0,1,0), ..., (3, 1,1,1). `e` and `e0` hold some geometry-related numbers.

With more atoms, the array size can reach 200 000. I thought it wouldn't hurt to lose combinations of fewer than 5 atoms, so I added the commented-out `if` checks, but the code did not run any faster. So the problem seems to be with the `if`s. Next I tried deleting the combinations with fewer than 5 atoms up front, keeping the indices so that the initial array can be rebuilt afterwards:

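``````
e(:, 7)  = find(e(:, 1));   % indices of the nonzero entries of column 1, stored in column 7
e0(:, 7) = find(e0(:, 1));
e(e(:, 4) < 5, :)   = [];   % delete rows whose column 4 is below 5
e0(e0(:, 4) < 5, :) = [];
...
``````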
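The "filter, keep the indices, rebuild afterwards" idea can be sketched with a logical mask. This is only an illustration on made-up data: it assumes column 1 holds the atom count, and `idx`, `keep`, `eSmall`, `idxSmall`, `resultSmall` and `resultFull` are hypothetical names. It uses `(1:n).'` for the row numbers, because `find(e(:, 1))` only equals the row numbers when column 1 contains no zeros.

```matlab
% Hypothetical data: column 1 = number of atoms, column 2 = a geometry value.
e = [3 1.0; 5 2.0; 6 3.0; 5 4.0];

idx  = (1 : size(e, 1)).';   % original row numbers, independent of the data
keep = e(:, 1) >= 5;         % logical mask: combinations with at least 5 atoms

eSmall   = e(keep, :);       % filtered array for the expensive step
idxSmall = idx(keep);        % maps each row of eSmall back to its row in e

% Stand-in for the real computation on the filtered rows (assumption).
resultSmall = eSmall(:, 2) * 10;

% Rebuild a full-size result; rows that were dropped stay zero.
resultFull = zeros(size(e, 1), 1);
resultFull(idxSmall) = resultSmall;
```

Keeping the row numbers in a separate vector also avoids adding a seventh column to `e` just for bookkeeping.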

This halved the time. I timed the whole 500-line program with tic/toc, and this is where the time goes. At this rate it would take 2 years for the 300 molecules I have chosen so far, and I would like to add more (20 000).

So what other ways of discarding atom combinations from my array can you think of? Maybe I should choose the cutoff per molecule size (with 15 atoms, results for 5 atoms can be discarded; with 8 atoms, 4). And if changing the numeric precision would reduce the time, how should I do it?

Version 2: could not be posted here (the webpage won't allow it); see the comments.

Version 3 (50x faster than v2):

``````
e(:, 7)  = find(e(:, 1));   % indices of the nonzero entries of column 1, stored in column 7
e0(:, 7) = find(e0(:, 1));
[val, ia, ib] = intersect(e(:, 1), e0(:, 1));
for i = 1 : numel(ia)
    for j = 1 : numel(ib)
        eI(e(ia(i), 7), e0(ib(j), 7)) = e(ia(i), 1);
        eT(e(ia(i), 7), e0(ib(j), 7)) = abs((e(ia(i), 2) - e0(ib(j), 2)) / e0(ib(j), 2) * 100);
    end
end
``````
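One thing to watch with `intersect`: it returns each shared value only once, so pairing every `ia(i)` with every `ib(j)` does not, in general, reproduce the matches of the original double loop. A possible way to keep the `intersect` idea while still filling every matching pair is to loop over the shared values and assign whole blocks. This is a sketch on made-up data, not the asker's code; `vals`, `rows` and `cols` are hypothetical names, and the block expression relies on implicit expansion (MATLAB R2016b or later).

```matlab
% Hypothetical inputs: column 1 = number of atoms, column 2 = geometry value.
e  = [5 1.0; 6 2.0; 5 3.0];
e0 = [5 2.0; 7 4.0; 6 8.0];

eI = zeros(size(e, 1), size(e0, 1));   % pre-allocate both result matrices
eT = zeros(size(e, 1), size(e0, 1));

vals = intersect(e(:, 1), e0(:, 1));   % atom counts present in both arrays
for k = 1 : numel(vals)
    rows = find(e(:, 1)  == vals(k));  % all rows of e with this atom count
    cols = find(e0(:, 1) == vals(k));  % all rows of e0 with this atom count
    eI(rows, cols) = vals(k);
    % numel(rows)-by-numel(cols) block of percentage differences
    eT(rows, cols) = abs((e(rows, 2) - e0(cols, 2).') ./ e0(cols, 2).' * 100);
end
```

The loop now runs once per distinct atom count rather than once per pair of rows, which should help when many rows share the same count.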

Solution

• Here is a vectorized form that computes the result without any for loop:

``````
e_eq_e0 = e(:, 1) == e0(:, 1).';   % r1-by-r0 logical matrix of matches

eI = e_eq_e0 .* e(:, 1);
eT = e_eq_e0 .* abs((e(:, 2) ./ e0(:, 2).' - 1) * 100);
``````
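To see what the vectorized form produces, here is a small worked example. The input values are made up for illustration, and the expression needs implicit expansion (MATLAB R2016b or later).

```matlab
% Hypothetical inputs: column 1 = number of atoms, column 2 = geometry value.
e  = [5 1.0; 6 2.0; 5 3.0];
e0 = [5 2.0; 7 4.0];

% Column vector compared with a row vector gives a 3-by-2 logical matrix.
e_eq_e0 = e(:, 1) == e0(:, 1).';

eI = e_eq_e0 .* e(:, 1);                                 % atom count where matched, else 0
eT = e_eq_e0 .* abs((e(:, 2) ./ e0(:, 2).' - 1) * 100);  % percent difference where matched

disp(eI);
disp(eT);
```

Two caveats, as far as I can tell. If any `e0(:, 2)` is zero, the unmatched entries in that column become `NaN` (from `0 * Inf`) instead of staying zero as in the loop. And the result is dense: if both arrays had 200 000 rows, each matrix would have 4e10 elements, about 320 GB in double precision, so the inputs probably need to be filtered or processed in chunks first.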

However, the main problem in your code is that you don't pre-allocate the matrices `eI` and `eT` before using them, so MATLAB has to grow them repeatedly inside the loop:

``````
eI = zeros(r1, r0);
eT = zeros(r1, r0);
for i = 1 : r1
    ....
``````
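For completeness, a minimal runnable version of the pre-allocated loop might look like the following. The input values are hypothetical; the loop body is the one from the question.

```matlab
% Hypothetical inputs: column 1 = number of atoms, column 2 = geometry value.
e  = [5 1.0; 6 2.0; 5 3.0];
e0 = [5 2.0; 7 4.0];

r1 = size(e, 1);
r0 = size(e0, 1);

% Allocate the full result matrices once, before the loops start.
eI = zeros(r1, r0);
eT = zeros(r1, r0);

for i = 1 : r1
    for j = 1 : r0
        if e(i, 1) == e0(j, 1)
            eI(i, j) = e(i, 1);                                     % number of atoms matched
            eT(i, j) = abs((e(i, 2) - e0(j, 2)) / e0(j, 2) * 100);  % percent difference
        end
    end
end
```

Pre-allocating also guarantees that `eI` and `eT` are always `r1`-by-`r0`, even when the last rows or columns contain no match. Without it, their size depends on the largest indices that happened to be assigned.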