
How Do I Access Thread Local Storage From ml64.exe (MSVC 64-bit X64 Assembler)?


The following C function attempts to prevent recursion in multicore code in a thread-safe manner using a thread local storage variable. However, for reasons that are somewhat complicated, I NEED to write this function in X64 assembler (Intel X86 / AMD 64-bit) and assemble it with ml64.exe from VC2010. I know how to do this if I'm using global variables but I'm not sure how to do it properly with a TLS variable that has __declspec(thread).

__declspec(thread) int tls_VAR = 0;
void norecurse(  )
{
    if(0==tls_VAR)
    {
        tls_VAR=1;
        DoWork();
        tls_VAR=0;
    }
}
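To make the intended behaviour concrete, here is a minimal single-threaded sketch of the guard. The re-entrant `DoWork`, the `work_calls` counter, the `THREAD_LOCAL` macro and the `guarded_call_count` helper are hypothetical additions for illustration only and are not part of the original code.

```c
#include <assert.h>

/* Portable spelling of the thread-local qualifier: MSVC uses
   __declspec(thread); C11 compilers use _Thread_local. */
#if defined(_MSC_VER)
#define THREAD_LOCAL __declspec(thread)
#else
#define THREAD_LOCAL _Thread_local
#endif

THREAD_LOCAL int tls_VAR = 0;   /* per-thread "already inside" flag */
static int work_calls = 0;      /* hypothetical counter; not thread-safe, demo only */

void norecurse(void);

/* Hypothetical DoWork: records the call, then tries to re-enter the guard. */
void DoWork(void)
{
    ++work_calls;
    norecurse();                /* does nothing: tls_VAR is 1 on this thread */
}

void norecurse(void)
{
    if (0 == tls_VAR) {
        tls_VAR = 1;
        DoWork();
        tls_VAR = 0;
    }
}

/* Returns how many times DoWork ran for one top-level call to norecurse. */
int guarded_call_count(void)
{
    work_calls = 0;
    norecurse();
    return work_calls;
}
```

With this `DoWork`, one top-level call runs the work exactly once and leaves the flag cleared, which is the behaviour any assembly replacement would need to reproduce.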

Note: This is what VC2010 kicks out for the function. However, MASM (ml64.exe) doesn't support the gs:88 or OFFSET FLAT: parts of the code.

; Listing generated by Microsoft (R) Optimizing Compiler Version 16.00.40219.01 

include listing.inc

INCLUDELIB MSVCRTD
INCLUDELIB OLDNAMES

PUBLIC  norecurse
EXTRN   DoWork:PROC
EXTRN   tls_VAR:DWORD
EXTRN   _tls_index:DWORD
pdata   SEGMENT
$pdata$norecurse DD imagerel $LN4
    DD  imagerel $LN4+70
    DD  imagerel $unwind$norecurse
pdata   ENDS
xdata   SEGMENT
$unwind$norecurse DD 040a01H
    DD  06340aH
    DD  07006320aH
; Function compile flags: /Ogtpy
xdata   ENDS
_TEXT   SEGMENT
norecurse PROC
; File p:\hackytests\64bittest2010\64bittest\64bittest.cpp
; Line 19
$LN4:
    mov QWORD PTR [rsp+8], rbx
    push    rdi
    sub rsp, 32                 ; 00000020H
; Line 20
    mov ecx, DWORD PTR _tls_index
    mov rax, QWORD PTR gs:88
    mov edi, OFFSET FLAT:tls_VAR
    mov rbx, QWORD PTR [rax+rcx*8]
    cmp DWORD PTR [rbx+rdi], 0
    jne SHORT $LN1@norecurse
; Line 22
    mov DWORD PTR [rbx+rdi], 1
; Line 23
    call    DoWork
; Line 24
    mov DWORD PTR [rbx+rdi], 0
$LN1@norecurse:
; Line 26
    mov rbx, QWORD PTR [rsp+48]
    add rsp, 32                 ; 00000020H
    pop rdi
    ret 0
norecurse ENDP
_TEXT   ENDS
END

Solution

  • As your answer indicates, the problem comes down to finding the MASM equivalents of the following two lines in the assembly listing generated by Microsoft's C++ compiler:

    mov rax, QWORD PTR gs:88
    mov edi, OFFSET FLAT:tls_VAR
    

    The first line is easy. Just replace gs:88 with gs:[88].

    The second line is less obvious. The OFFSET FLAT: operator is a red herring. It means use the offset relative to the beginning of the "FLAT" segment. With the 32-bit version of MASM, the FLAT segment is the segment that includes the entire 4 GB address space. This is the segment that's used for both the code and data segments as part of the 32-bit flat memory model. The 64-bit version of MASM doesn't support memory models; it essentially always assumes a 64-bit version of the flat memory model, so it doesn't support the FLAT keyword. As a result, the plain OFFSET operator ends up meaning the same thing. (In fact, with the 32-bit assembler, plain OFFSET also normally means the same thing because PE/COFF only supports the flat memory model.)

    However, using OFFSET here won't work, because it would produce the offset of tls_VAR relative to address 0. In other words, it would use the absolute address of tls_VAR in memory. What's needed here is the offset relative to the beginning of the TLS data section.

    So the compiler must be doing something special here. In order to find out, I dumped the relocations in the object file generated while compiling your example C code:

    > dumpbin /relocations t215a.obj
    ...  
    RELOCATIONS #4
                                                    Symbol    Symbol
     Offset    Type              Applied To         Index     Name
     --------  ----------------  -----------------  --------  ------
     00000008  REL32                      00000000        14  _tls_index
     00000016  SECREL                     00000000         8  tls_VAR
     0000002D  REL32                      00000000         C  DoWork
    ...
    

    As you can see, the compiler generates a relocation of type SECREL for the reference to tls_VAR. This makes the relocation relative to the base of the section in the generated executable in which the symbol appears. In this case that's the .tls section, so the relocation yields an offset relative to the beginning of the section used for static TLS data.

    So now the question becomes how to get MASM to generate the same SECREL relocation the compiler emits. This turns out to have an easy solution as well: just replace OFFSET FLAT: with SECTIONREL.

    So with these changes (and a bit of optimization) your function becomes:

        EXTERN  tls_VAR:DWORD
        EXTERN  _tls_index:DWORD
        EXTERN  DoWork:PROC
    
        PUBLIC  norecurse
    _TEXT SEGMENT
    norecurse PROC
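        push rbx                    ; save nonvolatile rbx; rsp is 16-byte aligned again
        sub rsp, 32                 ; shadow space for the call to DoWork
        mov rax, gs:[88]            ; TEB.ThreadLocalStoragePointer (offset 58h)
        mov ecx, _tls_index         ; this module's TLS index
        mov rbx, [rax + rcx * 8]    ; base of this thread's TLS block
        cmp DWORD PTR [rbx + SECTIONREL tls_VAR], 0
        jne return                  ; already inside on this thread: skip the work
        mov DWORD PTR [rbx + SECTIONREL tls_VAR], 1
        call DoWork
        mov DWORD PTR [rbx + SECTIONREL tls_VAR], 0
    return:
        add rsp, 32
        pop rbx
        ret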
    norecurse ENDP
    _TEXT ENDS
        END
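
The address arithmetic in the final listing can be modelled in plain C. This is only a sketch under stated assumptions: `thread_model`, `tls_address` and `model_demo` are hypothetical names, ordinary arrays stand in for the per-thread slot array that gs:[88] points at, and the index and offset values are made up. It does not touch the real TEB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one thread's view of static TLS. */
typedef struct {
    void **tls_array;            /* models the pointer loaded from gs:[88] */
} thread_model;

/* Mirrors the listing:
     mov rax, gs:[88]            -> t->tls_array
     mov ecx, _tls_index         -> tls_index
     mov rbx, [rax + rcx * 8]    -> block
     [rbx + SECTIONREL tls_VAR]  -> block + secrel                        */
static int *tls_address(const thread_model *t, uint32_t tls_index, size_t secrel)
{
    assert(t != NULL && t->tls_array != NULL);
    unsigned char *block = (unsigned char *)t->tls_array[tls_index];
    return (int *)(block + secrel);
}

/* Returns 1 if the same (index, offset) pair resolves to separate storage
   for two modelled threads. */
int model_demo(void)
{
    static int block_a[4], block_b[4];       /* each thread's copy of .tls */
    void *slots_a[1] = { block_a };
    void *slots_b[1] = { block_b };
    thread_model ta = { slots_a };
    thread_model tb = { slots_b };
    const uint32_t tls_index = 0;            /* assumed module index */
    const size_t secrel = 2 * sizeof(int);   /* assumed offset of tls_VAR in .tls */

    *tls_address(&ta, tls_index, secrel) = 1;    /* thread A sets its flag */
    return block_a[2] == 1 && *tls_address(&tb, tls_index, secrel) == 0;
}
```

The point of the model is that the section-relative offset is the same constant for every thread; only the block base fetched through the slot array differs, which is why an absolute address from plain OFFSET cannot work here.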