Search code examples
Why use push/pop instead of sub and mov?...


assembly, x86, x86-64, cpu-architecture, micro-optimization

Fastest way to initialize a __m128i constant with intrinsics?...


c, visual-c++, sse, intrinsics, micro-optimization

How to copy a register and do `x*4 + constant` with the minimum number of instructions...


assembly, x86, micro-optimization

latency for 'pcmpeqb' - memory vs xmm register...


assembly, optimization, sse, micro-optimization, sse2

Difference between n = 0 and n = n - n...


c, assembly, optimization, compiler-construction, micro-optimization

Allocating memory aligned buffers for SIMD; how does |16 give an odd multiple of 16, and why do it?...


c++, dynamic-memory-allocation, simd, memory-alignment, micro-optimization

How can I rearrange MIPS code to minimise the number of NOPs needed, by hand?...


assembly, optimization, mips, pipeline, micro-optimization

array_push() vs. $array[] = .... Which is fastest?...


php, mysql, micro-optimization

Micro optimization: Returning from an inner block at the end of a function...


javascript, micro-optimization

Weird performance effects from nearby dependent stores in a pointer-chasing loop on IvyBridge. Addin...


assembly, x86, micro-optimization, microbenchmark, micro-architecture

Can compilers ever optimize variables to use less than a byte of space?...


algorithm, performance, optimization, bit-manipulation, micro-optimization

Loop optimization. How does register renaming break dependencies? What is execution port capacity?...


performance, optimization, x86, cpu-architecture, micro-optimization

Fast Way of Indexing Operator in Python (lambda i: l[i])...


python, micro-optimization

The advantages of using 32bit registers/instructions in x86-64...


gcc, assembly, x86-64, micro-optimization

Transpile forEach, map, filter and for of to length based for loop in JavaScript...


javascript, webpack, micro-optimization

Why are bitwise operators slower than multiplication/division/modulo?...


python, optimization, bitwise-operators, micro-optimization

Is it useful to check if a Java collection is empty before beginning iteration?...


java, collections, garbage-collection, iteration, micro-optimization

Address-size override prefix in 64-bit or using 64-bit registers...


assembly, x86-64, micro-optimization, addressing-mode

Assembly Jump with Multiple plus or do plus before jump (performance)...


performance, assembly, x86, micro-optimization

Performance of assembly function with multiple RET...


performance, assembly, x86, x86-64, micro-optimization

Is there a penalty in having a non-aligned Jcc which is nearly never taken in Intel/AMD 64?...


loops, branch, x86-64, memory-alignment, micro-optimization

Fast method for testing a bit of a large int...


python, python-3.x, optimization, micro-optimization, largenumber

Is CMOVcc considered a branching instruction?...


assembly, x86-64, cpu-architecture, micro-optimization, branch-prediction

How can I resolve data dependency in pointer arrays?...


c++, performance, compiler-optimization, micro-optimization

Does Skylake need vzeroupper for turbo clocks to recover after a 512-bit instruction that only reads...


assembly, x86, intel, micro-optimization, avx512

Why swap doesn't use Xor operation in C++...


c++, swap, xor, micro-optimization, premature-optimization

Ways to make a D program faster...


optimization, d, micro-optimization, dmd, ldc

Why does .NET Native compile loop in reverse order?...


c#, assembly, x86, micro-optimization, .net-native

Micro-optimizing a linear search loop over a huge array with OpenMP: can't break on a hit...


c, loops, pthreads, openmp, micro-optimization

What is the compiler doing here that allows comparison of many values to be done with few actual com...


c++, assembly, optimization, x86-64, micro-optimization
