Search code examples
Assembly handwritten function slower than GCC compiled function...

assembly x86-64 cpu-architecture memory-alignment micro-optimization

Assembly function's data arrangement in data section...

assembly x86-64 cpu-architecture micro-optimization

ADD slower than ADC in the first step of a bigint multiply on Coffee Lake (Skylake)...

performance assembly x86 cpu-architecture micro-optimization

AND + CMP or SHR + CMP?...

c optimization cpu-architecture portability micro-optimization

x86-64 instruction to AND until zero?...

assembly bit-manipulation x86-64 micro-optimization

Is it possible to get the unsigned quotient and remainder at once in C?...

c performance division micro-optimization unsigned-integer

Which Intel microarchitecture introduced the ADC reg,0 single-uop special case?...

performance assembly x86 intel micro-optimization

How to strip debug symbols for real in Xcode?...

xcode macos strip micro-optimization debug-symbols

"enter" vs "push ebp; mov ebp, esp; sub esp, imm" and "leave" vs "...

assembly x86 stack micro-optimization stack-frame

How to properly increment some array key, even if key needs to be created?...

php optimization micro-optimization

Mixing SSE with AVX128 for shorter instructions?...

assembly x86 sse avx micro-optimization

Is it useful to use VZEROUPPER if your program+libraries contain no SSE instructions?...

performance assembly x86 avx micro-optimization

C optimization: conditional store to avoid dirtying a cache line...

c caching cpu-cache micro-optimization libuv

Setting and clearing the zero flag in x86...

performance assembly x86 x86-64 micro-optimization

How can the rep stosb instruction execute faster than the equivalent loop?...

performance assembly optimization x86 micro-optimization

Why are loops always compiled into "do...while" style (tail jump)?...

performance loops assembly optimization micro-optimization

Shorter x86 call instruction...

assembly x86 call micro-optimization machine-code

Are there any efficient micro-optimizations to find the number of unique grid paths?...

java micro-optimization

Is this a missed optimization in GCC, loading a 16-bit integer value from .rodata instead of an imm...

c gcc x86-64 compiler-optimization micro-optimization

How is a critical path formed when there is a data dependency between loop iterations while a CPU ...

performance assembly x86-64 cpu-architecture micro-optimization

Why is POP slow when using register R12?...

performance x86 intel cpu-architecture micro-optimization

Is it more efficient to multiply within the address displacement or outside it?...

assembly optimization x86 micro-optimization addressing-mode

What is faster: in_array or isset?...

php performance micro-optimization

What is faster in Python, "while" or "for xrange"...

python micro-optimization

Should I use Java's String.format() if performance is important?...

java string performance string-formatting micro-optimization

How to unroll a loop of a dot product in mips after re-ordering instructions?...

assembly mips cpu-architecture micro-optimization loop-unrolling

Most compact way to test for a negative number in x86 assembly?...

assembly x86 micro-optimization

test $x,%dil vs. test $x,%edi...

assembly optimization x86-64 att micro-optimization

Why should code be aligned to even-address boundaries on x86?...

assembly x86 memory-alignment micro-optimization

Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction...

c++ c assembly x86 micro-optimization
