I own IPP 6, and I see that IPP 8 is already available. Are there any benchmarks comparing IPP 6, 7 and 8 on the newest CPUs? I'm particularly interested in 1D basic ops (mul, add, complex), FFT and IIR filtering.
You can run the experiments yourself. IPP ships with a performance measurement utility, usually "ps*.exe" in the ipp\tools\perfsys directory. I can't say exactly how it looked at the time of IPP 6.x, but it should be similar. The "ps*.exe" executables let you measure the performance of a specific IPP function in clocks per element (the lower the better, of course) for different CPU optimizations. The basic options for these perf. tests are "-?", "-e" (shows all functions within the test), "-T" (turns on a specific CPU optimization only) and "-r" (saves the output into a csv file).
Suppose you want to measure the ippsIIR64f_32s_Sfs function for AVX, SSE41 and SSE3 CPUs. You need to run ps_ipps.exe (the 1D domain performance test) three times:
ps_ipps.exe -fippsIIR64f_32s_Sfs -B -R -TAVX (you'll get a csv file with the AVX optimization results)
ps_ipps.exe -fippsIIR64f_32s_Sfs -B -R -TSSE41 (SSE4.1 perf. data will be appended to the csv)
ps_ipps.exe -fippsIIR64f_32s_Sfs -B -R -TSSE3 (SSE3 performance data will be appended).
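If you want to repeat this for several functions, the three command lines can be generated rather than typed. This is only a rough sketch: the helper name build_commands is hypothetical, the flags are copied from the commands above, and the actual launch is left as a comment because it assumes ps_ipps.exe is in the current directory.

```python
# Hypothetical helper: builds the same command lines as shown above,
# one per CPU optimization passed in `targets`.
def build_commands(function, targets, exe="ps_ipps.exe"):
    return [[exe, f"-f{function}", "-B", "-R", f"-T{target}"] for target in targets]


commands = build_commands("ippsIIR64f_32s_Sfs", ["AVX", "SSE41", "SSE3"])

for cmd in commands:
    print(" ".join(cmd))
    # To actually run it (assumption: executed from the perfsys directory):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```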
Then search the csv file for the required function/argument combination, e.g.
find "ippsIIR64f,32s,Sfs,32768,6,numBq_DF1" ps_ipps.csv
For example, I get
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=2048,1.30,cpMac,512,-
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=8,1.56,cpMac,613,-
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=4,5.61,cpMac,2.21e+003,-
That means 5.61 clocks for SSE3, 1.56 clocks for SSE4.1 and 1.30 clocks for AVX. Your CPU must support the highest instruction set you want to measure. As for IPP 7 and 8, you can download "try-and-buy" versions of Intel products (Composer or Parallel Studio) from the Intel site to run the benchmarks.
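If you end up comparing many functions, pulling the numbers out of the csv by hand gets tedious. Here is a minimal Python sketch of the same lookup. It makes two assumptions that I inferred only from the three sample rows above: the clocks value is in the 11th comma-separated field, and the rows appear in the order the runs were made (AVX, SSE41, SSE3). The names KEY, CLOCKS_COL and clocks_per_run are mine, not part of IPP.

```python
import csv
import io

# Sample rows copied from the output above; for real use, read ps_ipps.csv instead.
SAMPLE = """\
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=2048,1.30,cpMac,512,-
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=8,1.56,cpMac,613,-
ippsIIR64f,32s,Sfs,32768,6,numBq_DF1,-,-,0,nLps=4,5.61,cpMac,2.21e+003,-
"""

# Assumption: rows are appended in the order the three runs were made.
RUN_ORDER = ["AVX", "SSE41", "SSE3"]
# Leading fields that identify the function/argument combination.
KEY = ["ippsIIR64f", "32s", "Sfs", "32768", "6", "numBq_DF1"]
# Assumption: zero-based index of the clocks value, inferred from the sample rows.
CLOCKS_COL = 10


def clocks_per_run(text, key=KEY, col=CLOCKS_COL):
    """Return the clocks value of every row whose leading fields match `key`."""
    rows = csv.reader(io.StringIO(text))
    return [float(row[col]) for row in rows if row[:len(key)] == key]


clocks = dict(zip(RUN_ORDER, clocks_per_run(SAMPLE)))
baseline = clocks["SSE3"]

for name, value in clocks.items():
    print(f"{name}: {value:.2f} clocks, {baseline / value:.2f}x vs SSE3")
```

The same function works on the real file if you pass it open("ps_ipps.csv").read() instead of SAMPLE, as long as the column layout matches the sample rows.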