Tags: gpu, sse, avx, avx2, boinc

Ubuntu - how to tell if AVX or SSE is currently being used by a CPU app?


I currently run BOINC across a number of servers which have GPUs.

The servers run both GPU and CPU BOINC apps.

As AVX and SSE lower the CPU frequency when a CPU app uses them, I have to be selective about which CPU and GPU apps I run together: some GPU apps get bottlenecked (slower completion times), whereas others do not.

At present, some CPU apps are named so that it is clear whether they use AVX, but most are not.

Is there therefore a command I can run, or some other way of checking, whether any of the CPU apps currently running are using AVX or SSE (any version)?

Also, as a side note, should I treat any FMA usage the same way (e.g. does it lower the CPU frequency because of increased CPU temperatures)?

Thanks


Solution

  • You can use perf top to see, in real time, how many packed floating-point SSE and AVX instructions are being executed, along with the names of the executables and shared libraries executing them:

    perf top -e fp_arith_inst_retired.128b_packed_single -e fp_arith_inst_retired.128b_packed_double -e fp_arith_inst_retired.256b_packed_single -e fp_arith_inst_retired.256b_packed_double
    

    Counter descriptions (from perf list output on an Intel Coffee Lake CPU):

    floating point:
      fp_arith_inst_retired.128b_packed_double          
           [Number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired. Each count represents 2 computations. Applies to SSE* and AVX*
            packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform
            multiple calculations per element]
      fp_arith_inst_retired.128b_packed_single          
           [Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX*
            packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they
            perform multiple calculations per element]
      fp_arith_inst_retired.256b_packed_double          
           [Number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX*
            packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform
            multiple calculations per element]
      fp_arith_inst_retired.256b_packed_single          
           [Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired. Each count represents 8 computations. Applies to SSE* and AVX*
            packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they
            perform multiple calculations per element]
      fp_arith_inst_retired.scalar_double               
           [Number of SSE/AVX computational scalar double precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar double
            precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element]
      fp_arith_inst_retired.scalar_single               
           [Number of SSE/AVX computational scalar single precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar single
            precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations
            per element]
      fp_assist.any                                     
           [Cycles with any input/output SSE or FP assist]
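
To check one specific process rather than watching the whole machine, the same counters can be attached to a single PID with perf stat. The following is a minimal sketch, not a tested recipe: the function names packed_fp_events and count_packed_fp are made up for illustration, the event names are the ones from the perf list output above and so assume an Intel CPU that exposes them, and the 10-second default window is arbitrary.

```shell
#!/bin/sh
# Sketch: count packed SSE/AVX floating-point instructions retired by one
# running process. Function names are illustrative, not part of perf.

# Print the four packed-FP event names (taken from the perf list output
# above) as one comma-separated list suitable for "perf stat -e".
packed_fp_events() {
    printf '%s,%s,%s,%s\n' \
        fp_arith_inst_retired.128b_packed_single \
        fp_arith_inst_retired.128b_packed_double \
        fp_arith_inst_retired.256b_packed_single \
        fp_arith_inst_retired.256b_packed_double
}

# Usage: count_packed_fp PID [SECONDS]
# Attaches to PID and counts for SECONDS (default 10); "sleep" only acts
# as a timer that bounds the measurement window.
count_packed_fp() {
    pid=$1
    secs=${2:-10}
    if [ -z "$pid" ]; then
        echo "usage: count_packed_fp PID [SECONDS]" >&2
        return 2
    fi
    perf stat -e "$(packed_fp_events)" -p "$pid" -- sleep "$secs"
}

# Only measure when the script is given arguments.
if [ "$#" -gt 0 ]; then
    count_packed_fp "$@"
fi
```

Saved under a hypothetical name such as packed_fp.sh, it would be run as `sh packed_fp.sh 12345 30` to sample PID 12345 for 30 seconds. Non-zero 256b_packed counts mean the process is executing 256-bit AVX floating-point arithmetic. The 128b_packed counts cover both SSE and 128-bit AVX encodings, so they cannot separate the two, and integer SIMD instructions are not counted by these events at all. Attaching to a process owned by another user typically needs root or a relaxed kernel.perf_event_paranoid setting.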