c++ c arrays variable-assignment struct

Why do C and C++ support memberwise assignment of arrays within structs, but not generally?


I understand that memberwise assignment of arrays is not supported, such that the following will not work:

int num1[3] = {1,2,3};
int num2[3];
num2 = num1; // "error: invalid array assignment"

I just accepted this as fact, figuring that the aim of the language is to provide an open-ended framework and let the user decide how to implement something such as the copying of an array.

However, the following does work:

struct myStruct { int num[3]; };
struct myStruct struct1 = {{1,2,3}};
struct myStruct struct2;
struct2 = struct1;

The array num[3] is member-wise assigned from its instance in struct1, into its instance in struct2.

Why is member-wise assignment of arrays supported for structs, but not in general?

edit: Roger Pate's comment in the thread "std::string in struct - Copy/assignment issues?" seems to point in the general direction of the answer, but I don't know enough to confirm it myself.

edit 2: Many excellent responses. I choose Luther Blissett's because I was mostly wondering about the philosophical or historical rationale behind the behavior, but James McNellis's reference to the related spec documentation was useful as well.


Solution

  • Here's my take on it:

    The Development of the C Language offers some insight into the evolution of the array type in C:

    I'll try to outline the array thing:

    C's forerunners B and BCPL had no distinct array type, a declaration like:

    auto V[10] (B)
    or 
    let V = vec 10 (BCPL)
    

    would declare V to be an (untyped) pointer which is initialized to point to an unused region of 10 "words" of memory. B already used * for pointer dereferencing and had the [] shorthand notation: V[i] meant *(V+i), just as in C/C++ today. However, V is not an array; it is still a pointer which has to point to some memory. This caused trouble when Dennis Ritchie tried to extend B with struct types. He wanted arrays to be part of the structs, like in C today:

    struct {
        int inumber;
        char name[14];
    };
    

    But with the B/BCPL concept of arrays as pointers, this would have required the name field to contain a pointer which had to be initialized at runtime to point to a memory region of 14 bytes within the struct. The initialization/layout problem was eventually solved by giving arrays special treatment: the compiler would track the location of arrays in structures, on the stack, etc. without actually requiring the pointer to the data to materialize, except in expressions which involve the arrays. This treatment allowed almost all B code to still run and is the source of the "arrays convert to pointer if you look at them" rule. It is a compatibility hack, which turned out to be very handy, because it allowed arrays of open size etc.
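    The two effects described above (the array's storage sits directly inside the struct, and a pointer only appears when the array is used in an expression) can be checked in modern C. This is a minimal sketch; the struct tag `entry` and the function name `name_is_stored_inline` are made up for illustration and do not come from Ritchie's paper.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the struct in the text; the tag "entry" is an
   illustrative name, not part of the original. */
struct entry {
    int  inumber;
    char name[14];
};

/* Returns 1 if the array lives inside the struct at a compile-time
   offset, with no separately stored pointer to its data. */
int name_is_stored_inline(void)
{
    struct entry e;
    const char *base = (const char *)&e;
    const char *p = e.name;   /* array converts to a pointer to its first element */

    assert(p == &e.name[0]);  /* the conversion yields the first element's address */

    return p == base + offsetof(struct entry, name) /* data is at a fixed offset */
        && sizeof e.name == 14;                     /* sizeof sees the whole array, not a pointer */
}
```

    Calling `name_is_stored_inline()` from a `main` should return 1 on a conforming compiler: the pointer is computed from the struct's address whenever `e.name` is used in an expression, and is never stored as a field.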

    And here's my guess why arrays can't be assigned: Since arrays were pointers in B, you could simply write:

    auto V[10];
    V=V+5;
    

    to rebase an "array". This was now meaningless, because the base of an array variable was not an lvalue anymore. So this assignment was disallowed, which helped to catch the few programs that did this rebasing on declared arrays. And then this notion stuck: as arrays were never designed to be first-class citizens of the C type system, they were mostly treated as special beasts which become pointers if you use them. And from a certain point of view (which ignores that C arrays are a botched hack), disallowing array assignment still makes some sense: an open array or an array function parameter is treated as a pointer without size information. The compiler doesn't have the information to generate an array assignment for them, and the pointer assignment was required for compatibility reasons. Introducing array assignment for the declared arrays would have introduced bugs through spurious assignments (is a=b a pointer assignment or an element-wise copy?) and other trouble (how do you pass an array by value?) without actually solving a problem. Just make everything explicit with memcpy!

    /* Example of how array assignment would make things even weirder in C/C++,
       if we don't want to break existing code.
       It's actually better to leave things as they are...
    */
    typedef int vec[3];
    
    void f(vec a, vec b) 
    {
        vec x,y; 
        a=b; // pointer assignment
        x=y; // NEW! element-wise assignment
        a=x; // pointer assignment
        x=a; // NEW! element-wise assignment
    }
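    For contrast with the hypothetical semantics above, here is a sketch of the explicit copy that works in C today. It reuses the `vec` typedef from the example; `copy_ints` and `copy_and_sum` are hypothetical helper names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int vec[3];

/* An array argument arrives as a plain pointer with no size attached,
   so the element count has to be passed separately.
   (copy_ints is a hypothetical helper, not a library function.) */
void copy_ints(int *dst, const int *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, count * sizeof *src);   /* explicit copy of count elements */
}

/* Copies one declared array into another and returns the sum of the
   destination so the result is easy to check. */
int copy_and_sum(void)
{
    vec x = {1, 2, 3};
    vec y = {0, 0, 0};

    /* y = x; would be rejected by the compiler: arrays are not assignable. */
    copy_ints(y, x, sizeof x / sizeof x[0]);  /* size is known here, at the declaration */

    return y[0] + y[1] + y[2];
}
```

    The element count is computed where the array is declared, because that is the only place the compiler still knows it; inside `copy_ints` only the pointers remain.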
    

    This didn't change when a revision of C in 1978 added struct assignment ( http://cm.bell-labs.com/cm/cs/who/dmr/cchanges.pdf ). Even though records were distinct types in C, it was not possible to assign them in early K&R C. You had to copy them member-wise or with memcpy, and you could pass only pointers to them as function parameters. Assignment (and parameter passing) was now simply defined as the memcpy of the struct's raw memory, and since this couldn't break existing code it was readily adopted. As an unintended side effect, this implicitly introduced some kind of array assignment, but this happened somewhere inside a structure, so it couldn't really introduce problems with the way arrays were used.
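    The side effect is easy to observe with the struct from the question. The following is a small sketch, with `bump_and_sum` and `struct_copy_is_independent` as illustrative names, showing that both assignment and by-value parameter passing copy the embedded array.

```c
#include <assert.h>

struct myStruct { int num[3]; };

/* Receives its own copy of the struct, array included; changes made
   here do not reach the caller's object. */
int bump_and_sum(struct myStruct s)
{
    s.num[0] += 10;
    return s.num[0] + s.num[1] + s.num[2];
}

/* Returns 1 if assignment and by-value passing both copied the array. */
int struct_copy_is_independent(void)
{
    struct myStruct struct1 = {{1, 2, 3}};
    struct myStruct struct2;
    int sum;

    struct2 = struct1;                 /* copies the embedded array */
    assert(struct2.num[2] == struct1.num[2]);

    struct2.num[1] = 99;               /* modifies only the copy */
    sum = bump_and_sum(struct1);       /* callee works on 11, 2, 3 */

    return struct1.num[0] == 1 && struct1.num[1] == 2 && struct1.num[2] == 3
        && struct2.num[0] == 1 && struct2.num[1] == 99 && struct2.num[2] == 3
        && sum == 16;
}
```

    Wrapping an array in a struct like this is a common way to get value semantics for a fixed-size array, at the cost of copying the whole array on every assignment or call.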