Search code examples
How to use Fused Multiply-Add (FMA) instructions with SSE/AVX...

c, sse, cpu-architecture, avx, fma

Floating point vs integer calculations on modern hardware...

c++, x86, floating-point, x86-64, cpu-architecture

Are SIMD and VLIW instructions the same thing?...

x86, cpu-architecture, simd, instruction-set, vliw

Why can't we have a safe ISA?...

security, memory-management, cpu-architecture, instruction-set, memory-safety

Which factors affect the needed compiler?...

compilation, operating-system, cpu, cpu-architecture, instruction-set

Detecting architecture at compile time from MASM/MASM64...

assembly, x86-64, cpu-architecture, masm, masm32

Do any CPU architectures use Metadata?...

cpu-architecture

What do the letters in port usage on uops.info mean?...

x86, cpu, cpu-architecture, intel

Determine target ISA extensions of binary file in Linux (library or executable)...

linux, shared-libraries, executable, cpu-architecture, instruction-set

Do assembly instructions map 1-1 to machine language?...

assembly, cpu-architecture, machine-code

Slowing down CPU Frequency by imposing memory stress...

c++, linux, cpu, intel, cpu-architecture

If cache invalidation happens every time memory mappings change, why not opt for VIVT?...

caching, x86, cpu, cpu-architecture, cpu-cache

Can addition be done in less than a cycle when outputs depend on each other?...

assembly, x86, cpu-architecture, intel

How do modern Intel x86 CPUs implement the total order over stores...

x86, intel, cpu-architecture, memory-barriers, mesi

Understanding synchronization with multiple processors...

java, multithreading, cpu-architecture, atomic, compare-and-swap

Difference between low and high 8-bit registers; do their values use bits in opposite bit-endian ord...

x86, cpu-architecture, cpu-registers

How to start learning assembly language on any system...

assembly, cpu-architecture, portability, platform-independent

Was there any advantage to the 386 architecture making 16-bit register arithmetic leave upper bits u...

assembly, x86, cpu-architecture, hardware, cpu-registers

Was there a P4 model with double-pumped 64-bit operations?...

x86, x86-64, intel, cpu-architecture

Atomicity of loads and stores on x86...

c++, x86, cpu-architecture, atomic, memory-barriers

optimal to flush low-contention atomic from caches?...

multithreading, cpu-architecture, atomic, cpu-cache, mesi

CPU operations during g++ compiling...

compilation, g++, cpu, cpu-architecture, build-server

Are programs compiled for RV32E guaranteed to produce equivalent results on RV32I machines?...

assembly, cpu-architecture, cpu-registers, riscv

Why does floating-point output differ across platforms?...

java, jdbc, floating-point, cpu-architecture, ieee-754

How do I force the CPU to perform in order execution of a program without any loops or branches?...

gcc, x86, cpu, cpu-architecture

What's the purpose of the rotate instructions (ROL, RCL on x86)?...

assembly, x86, cpu-architecture, bit-shift, instruction-set

Does INVLPG instruction or mprotect() affect the CPU cache state while invalidating TLB entries?...

assembly, x86, cpu-architecture, cpu-cache, tlb

AVX2 / gcc: Improve CPU-level parallelism by using different registers...

gcc, vectorization, cpu-architecture, simd, avx2

How much of ‘What Every Programmer Should Know About Memory’ is still valid?...

optimization, memory, x86, cpu-architecture, cpu-cache

What does memory_order_consume really do?...

c++, cpu-architecture, lock-free, memory-model, stdatomic
