C array = array faster than memcpy()

I have a piece of C code that I am trying to optimise. It involves copying an array a into an array b. I am currently using memcpy to do this, and it works, but it's not fast enough. For example:

double a[4] = {1.0, 2.0, 3.0, 4.0};
double b[4];
memcpy(b, a, sizeof(a));

This is a basic example; my program is similar but uses up to 9000 doubles. I know that using pointers can save a lot of time, but I'm not sure how to do it. Your help is greatly appreciated.

EDIT: I don't need to keep the a array; it can be discarded. I just need to transfer the values from a to b.

Solution

I use the values in b to determine new values for a. It goes through a while loop checking for a convergence in data.

In that case you may be able to avoid any copying, if you switch the arrays back and forth, something like this (which is backwards from what you wrote; adjust as needed):

double array1[SIZE], array2[SIZE];
double *a = array1, *b = array2;
generate_initial_values(array1);

for (;;)
{
    // do either:
    // (sizeof either array will do; *don't* use sizeof a or sizeof b,
    // which is only the size of a pointer, not of the array)
    memcpy(b, a, sizeof array1);
    update_values_in_b(b);

    // or, better:
    produce_modified_values_in_b_from_a(a, b);

    if (converged(a, b)) break;
    // switch arrays
    double* temp_ptr = a;
    a = b;
    b = temp_ptr;
}

Doing it the second way will be faster if that works for you. If you must memcpy, you can try the techniques in the question "Very fast memcpy for image processing?", but probably the best option for you is to use memcpy and set the compiler's optimization level as high as possible. Be sure that you #include <string.h> and that the size argument to memcpy is a compile-time constant (it is above), and look at the generated assembly code to verify that the compiler is inlining the copy.
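To make the advice about the size argument concrete, here is a minimal sketch. The names copy_fixed and pointer_size_seen_by_callee are made up for illustration, and SIZE is assumed to be 9000 only because that is the upper bound mentioned in the question. Whether the compiler really expands the copy inline depends on the compiler and its flags, so the assembly still needs checking.

```c
#include <assert.h>
#include <string.h>

#define SIZE 9000  /* assumption: the upper bound mentioned in the question */

/* Hypothetical helper: copies exactly SIZE doubles.
   The byte count is a compile-time constant, which gives the optimizer
   the chance to expand the copy inline instead of calling memcpy. */
static void copy_fixed(double *dst, const double *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, SIZE * sizeof *src);
}

/* Hypothetical helper showing the pitfall: once an array has been passed
   to a function it is only a pointer, so sizeof yields the size of the
   pointer and not the size of the array. */
static size_t pointer_size_seen_by_callee(const double *p)
{
    return sizeof p;
}
```

In the scope where an array is declared, sizeof on the array gives SIZE * sizeof(double); passing the same array to pointer_size_seen_by_callee gives sizeof(double *), which is why sizeof a or sizeof b is the wrong size once a and b are pointers.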

Edit: Wait, here's another thought, that doesn't even require switching arrays:

double a[SIZE], b[SIZE];
generate_initial_values(a);

for (;;)
{
    produce_modified_values_in_second_array_from_first(a, b);
    if (converged(a, b)) break;
    produce_modified_values_in_second_array_from_first(b, a);
    if (converged(b, a)) break;
}

When you exit the loop you don't know which array has the latest values, but if they've converged you probably don't care. If you do, you can set a pointer to the latest values, or use a function:

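double* get_great_values(double* a1, double* a2); // prototype, so the call below is declared before use

void calling_function(void)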
{
    ...
    double a[SIZE], b[SIZE];
    generate_initial_values(a);
    double* great_values = get_great_values(a, b); // returns either a or b
    ...
}

double* get_great_values(double* a1, double* a2)
{
    for (;;)
    {
        produce_modified_values_in_second_array_from_first(a1, a2);
        if (converged(a1, a2)) return a2;
        produce_modified_values_in_second_array_from_first(a2, a1);
        if (converged(a2, a1)) return a1;
    }
}
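For the other option mentioned above, setting a pointer to the latest values, here is a rough self-contained sketch. Everything in it is an assumption made for illustration: the size N, the tolerance, and the functions relax, close_enough and solve are hypothetical stand-ins for your real update and convergence test. The update used here is a simple relaxation in which each interior value becomes the mean of its neighbours, chosen because it reads neighbouring elements and so really does need a separate output array.

```c
#include <assert.h>
#include <stddef.h>

#define N 8  /* hypothetical size, kept small for the example */

/* Hypothetical update step: the endpoints stay fixed, and each interior
   value in `out` becomes the mean of its two neighbours in `in`. */
static void relax(const double *in, double *out)
{
    out[0] = in[0];
    out[N - 1] = in[N - 1];
    for (size_t i = 1; i + 1 < N; i++)
        out[i] = 0.5 * (in[i - 1] + in[i + 1]);
}

/* Hypothetical convergence test: largest element-wise change is tiny. */
static int close_enough(const double *x, const double *y)
{
    for (size_t i = 0; i < N; i++) {
        double d = x[i] - y[i];
        if (d < 0.0) d = -d;
        if (d > 1e-10) return 0;
    }
    return 1;
}

/* Alternates the roles of a and b without copying or swapping.
   On return, *latest points at whichever array was written last.
   Returns the number of update steps performed. */
static unsigned long solve(double *a, double *b, const double **latest)
{
    unsigned long steps = 0;
    assert(a != b && latest != NULL);
    for (;;) {
        relax(a, b); steps++;
        if (close_enough(a, b)) { *latest = b; return steps; }
        relax(b, a); steps++;
        if (close_enough(b, a)) { *latest = a; return steps; }
    }
}
```

A caller would declare double a[N], b[N] and a const double *latest, fill a with initial values, call solve(a, b, &latest), and then read the results through latest. Only a needs initialising, because the first step writes every element of b before anything reads it.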