Memory fences in a subfunction vs. in the same function as the data change

Is there any difference in thread safety if I place memory fences in subfunctions rather than in the function where the data is actually used? The example below includes both versions. I wonder whether there are differences that I am not aware of. Are the functions A_function and B_function equally thread-safe?

#include<atomic>

using std::atomic;
using std::atomic_thread_fence;
using std::memory_order_acquire;
using std::memory_order_release;

typedef struct 
{
    atomic<int> lock;
    int counter;
}Data;

void A_acquire(atomic<int> * lock);
void A_release(atomic<int> * lock);
void A_function(Data * data);
void B_acquire(atomic<int> * lock);
void B_release(atomic<int> * lock);
void B_function(Data * data);

void A_acquire(atomic<int> * lock)
{
    int ticket = lock->fetch_add(1);
    while (0 != ticket)
    {
        lock->fetch_sub(1);
        ticket = lock->fetch_add(1);
    }
    //DIFFERENCE HERE
}

void A_release(atomic<int> * lock)
{
    //DIFFERENCE HERE
    lock->fetch_sub(1);
}

void A_function(Data * data)
{
    A_acquire(&data->lock);
    atomic_thread_fence(std::memory_order_acquire); //DIFFERENCE HERE
    data->counter += 1;
    atomic_thread_fence(std::memory_order_release); //DIFFERENCE HERE
    A_release(&data->lock);
}

void B_acquire(atomic<int> * lock)
{
    int ticket = lock->fetch_add(1);
    while (0 != ticket)
    {
        lock->fetch_sub(1);
        ticket = lock->fetch_add(1);
    }
    atomic_thread_fence(std::memory_order_acquire); //DIFFERENCE HERE
}

void B_release(atomic<int> * lock)
{
    atomic_thread_fence(std::memory_order_release); //DIFFERENCE HERE
    lock->fetch_sub(1);
}

void B_function(Data * data)
{
    B_acquire(&data->lock);
    //DIFFERENCE HERE
    data->counter += 1;
    //DIFFERENCE HERE
    B_release(&data->lock);
}

int main(void)
{
    Data dat = { {0}, 0 }; // inner braces: copy-initializing an atomic from a plain 0 is ill-formed before C++17
    A_function(&dat);
    B_function(&dat);
    return 0;
}

Solution

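There is no semantic difference between A_function and B_function. The effect of a memory fence is not confined to the body of the function that contains it. What matters is the order in which the fence and the surrounding operations execute within the thread (the sequenced-before relation), and that order is the same in both versions.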

Also, as Phantom notes, the memory fences in your example are unnecessary: both fetch_sub() and fetch_add() default to memory_order_seq_cst, which already provides acquire and release semantics.
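To illustrate the point about the fences being redundant, here is a minimal sketch of the same lock with every fence removed. The names CData, C_acquire, C_release, C_function and C_demo are hypothetical and do not appear in the question; the sketch assumes the default memory_order_seq_cst on fetch_add() and fetch_sub() is left untouched.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical fence-free variant of the lock from the question.
struct CData
{
    std::atomic<int> lock{0};
    int counter = 0;
};

void C_acquire(std::atomic<int> * lock)
{
    // fetch_add defaults to memory_order_seq_cst, which includes acquire,
    // so no separate acquire fence is needed after the loop.
    while (0 != lock->fetch_add(1))
    {
        lock->fetch_sub(1); // back out our increment and retry
    }
}

void C_release(std::atomic<int> * lock)
{
    // fetch_sub defaults to memory_order_seq_cst, which includes release,
    // so no separate release fence is needed before it.
    lock->fetch_sub(1);
}

void C_function(CData * data)
{
    C_acquire(&data->lock);
    data->counter += 1;
    C_release(&data->lock);
}

// Single-threaded smoke test: the lock must be free again after each call.
int C_demo()
{
    CData data;
    C_function(&data);
    C_function(&data);
    assert(data.lock.load() == 0);
    return data.counter;
}
```

C_demo() only checks the bookkeeping in one thread; it does not by itself demonstrate mutual exclusion.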

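With the modifications below, however, the release fence becomes vital, because the relaxed store in A_release no longer orders the preceding write to counter by itself: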

void A_acquire(atomic<int> * lock)
{
    int ticket = lock->exchange(1);
    while (0 != ticket)
    {
        ticket = lock->exchange(1);
    }
    //DIFFERENCE HERE
}

void A_release(atomic<int> * lock)
{
    //DIFFERENCE HERE
    lock->store(0, memory_order_relaxed);
}
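As a rough sketch of how the modified lock could look with the release fence placed inside the helper (the B-style placement from the question), consider the following. The names SpinData, spin_acquire, spin_release, spin_function and spin_demo are hypothetical, and the thread and iteration counts are arbitrary. The acquire side needs no fence here because exchange() still defaults to memory_order_seq_cst.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical names; only the locking scheme follows the answer above.
struct SpinData
{
    std::atomic<int> lock{0};
    int counter = 0;
};

void spin_acquire(std::atomic<int> * lock)
{
    // exchange defaults to memory_order_seq_cst, which includes acquire.
    while (0 != lock->exchange(1))
    {
        // spin until the previous value was 0, i.e. the lock was free
    }
}

void spin_release(std::atomic<int> * lock)
{
    // The store below is relaxed, so this fence is what makes the
    // write to counter visible to the next thread that takes the lock.
    std::atomic_thread_fence(std::memory_order_release);
    lock->store(0, std::memory_order_relaxed);
}

void spin_function(SpinData * data)
{
    spin_acquire(&data->lock);
    data->counter += 1;
    spin_release(&data->lock);
}

// Runs `threads` workers, each incrementing the counter `iterations` times.
int spin_demo(int threads, int iterations)
{
    SpinData data;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t)
    {
        workers.emplace_back([&data, iterations] {
            for (int i = 0; i < iterations; ++i)
            {
                spin_function(&data);
            }
        });
    }
    for (auto & worker : workers)
    {
        worker.join();
    }
    return data.counter; // safe to read: join() synchronizes with each worker
}
```

Moving the fence out of spin_release and into spin_function, directly before the call, would give the same ordering. Replacing the fence plus relaxed store with a single lock->store(0, std::memory_order_release) is the more common way to write the same thing.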