First of all, it is a request not to mark this question as duplicate as it will be clear once it is studied in detail.
I am trying to implement Orthogonal Matching Pursuit algorithm. For that I need to find the dot product of two matrices of size 144*14596 and 144*1 as shown below
clc,clear;
load('E');
load('R');
load('P');
sparse=zeros(14596,2209);
dictionary=tem2;
atoms=zeros(size(dictionary,1),size(dictionary,2));
coefs=zeros(size(dictionary,2),1);
tic
%Normalize the dictionary
for index=1:size(dictionary,2)
dictionary(:,index)=dictionary(:,index)./norm(dictionary(:,index));
end
D=dictionary;
/* NOTE: I tried for ii=1:5 to check the difference in computational time*/
for ii=1:2209
r=tem4(:,ii);
dictionary=D;
index=[];
count=0;
t=5;
while(t>1e-15 && count~=144)
/***************Problem lies here**************/
% inner_product=dictionary'*r; %Dot Product (Should be slow but is fast)
inner_product=dotProduct(dictionary',r); %(Should be fast but is very slow)
/****************************************************/
[m,ind]=max(abs(inner_product));
index=[index ind];
atoms(:,ind)=dictionary(:,ind); %Select atom which has maximum inner product
dictionary(:,ind)=0;
at=atoms(:,index);
x=(at'*at)\(at'*r);
coefs(index)=x;
r=r-at*x;
t=norm(r);
count=count+1;
end
sparse(:,ii)=coefs;
end
sig=D*sparse;
final=uint8((repmat((((max(tem4))-min(tem4))./((max(sig)-min(sig)))),size(tem4,1),1).*(sig-repmat(min(sig),size(tem4,1),1)))+repmat(min(tem4),size(tem4,1),1));
toc
but the problem I am facing is that it takes a lot of time (as seen in the profiler report) in finding out the dot product using the following code in MATLAB.
inner_product=dictionary'*r;
In order to reduce the computational time, I wrote the MEX code as shown below to find out the dot product:
/***********************************************************************
*Program to create a MEX-file to find the dot product of matrices *
*Created by: Navdeep Singh *
*@Copyright Reserved *
***********************************************************************/
#include "mex.h"
void dot_prod(double *m1,double *m2, double *t,size_t M,size_t N, size_t M2,size_t N2 )
{
int i,j,k;
double s;
for(i=0;i<M;i++)
{ for(k=0;k<N2;k++)
{ s=0;
for(j=0;j<N;j++)
{ s=s+*((m1+i)+(M*j))*(*(m2+(j+M2*k)));
}
*((t+i)+(M*k))=s;
}
}
}
void mexFunction(int nlhs,mxArray *plhs[],int nrhs, const mxArray *prhs[])
{ double *mat1,*mat2,*out;
size_t rows_mat1,cols_mat1,rows_mat2,cols_mat2;
mat1=mxGetPr(prhs[0]);
mat2=mxGetPr(prhs[1]);
rows_mat1=mxGetM(prhs[0]);
cols_mat1=mxGetN(prhs[0]);
rows_mat2=mxGetM(prhs[1]);
cols_mat2=mxGetN(prhs[1]);
plhs[0]=mxCreateDoubleMatrix(rows_mat1,cols_mat2,mxREAL);
out=mxGetPr(plhs[0]);
dot_prod(mat1,mat2,out,rows_mat1,cols_mat1,rows_mat2,cols_mat2);
}
But to my surprise I found out that MEX solution is much much slower than the one used in MATLAB, which defeats the ultimate purpose of MEX. To know the cause I searched a lot on internet and found some interesting facts such as :
Matlab: Does calling the same mex function repeatedly from a loop incur too much overhead?
Matlab mex-file with mexCallMATLAB is almost 300 times slower than the corresponding m-file
These links suggests that overhead should not be much and if there is some it is always for the first call, as time is needed to load symbol tables etc . -- But contrary to this I found that a lot of overhead is incurring in my code.
Also I found out that size of arguments does not matter though number of arguments can affect the computational time but it is again minimal. One of the link also suggest that dynamically allocated memory should be freed (apart from the one allocated by matlab itself) but I don't have any such kind of allocation also.
So kindly let me know what is the reason behind
why MEX is taking huge amount of time?
What can be the solution to it?
Your help is really appreciated.
Various files can be found here:
Matlab has highly optimized code to calculate dot product of to matrices,
you just wrote a nested for loop to calculate the dot product , so you could compare this Mex code just with "similar nested for loops" in matlab then decide whether MEX code is faster or matlab ,
in fact matlab does not use nested for loop to calculate dot product of matrices,
from MATLAB doc:
MEX-Files have several applications:
calling large pre-existing c/c++ and FORTRAN programs from MATLAB without rewriting them as MATLAB functions
Replacing performance-critical routines with c/c++ implementations
MEX files are not appropriate for all applications. MATLAB is a high-productivity environment whose specialty is eliminating time-consuming, low-level programming in compiled languages like C or C++. In general, do your programming in MATLAB. Do not use MEX files unless your application requires it.