Tags: c, struct, ansi-c

Have we always been able to access struct members from function calls in C?


Or, since which year/standard version/compiler/compiler-version... can we do this?

file: fretstruct.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* this is global for MRFE simplicity - please don't do this. */
struct rectangle {
    int width;
    int height;
};


struct rectangle *create_rectangle(int width, int height) {
    struct rectangle *retrect = malloc(sizeof *retrect);
    retrect->width = width;
    retrect->height = height; 
    return retrect;
}

int main (void) {
    int rect_width = create_rectangle(20, 10)->width; /* needlessly leaking a block of memory */
    printf("width: %d\n", rect_width);

    printf("Press [RET] or [ENTER] key to exit the program...");
    getc(stdin); 

    return 0;
} 

Output:

width: 20
Press [RET] or [ENTER] key to exit the program...

I'm not surprised by this result when compiling against GNU17 (the default):

gcc -Wall -o fretstruct fretstruct.c - Ubuntu 22.04 (Linux 5.19) GCC 12.2.0

gcc -Wall -o fretstruct.exe fretstruct.c - Windows 11 (NT) GCC 12.1.0

However, I did NOT expect the same result when compiling against C89 (ANSI C):

gcc -Wall -std=c89 -pedantic -o fretstruct fretstruct.c - Ubuntu 22.04 (Linux 5.19) GCC 12.2.0

gcc -Wall -std=c89 -pedantic -o fretstruct.exe fretstruct.c - Windows 11 (NT) GCC 12.1.0

But with C89 I got no compile errors and the exact same output.

Now, I know that in C a function call evaluates to its return value in an expression, but in this case it seems wasteful: the pointer to the struct is lost as soon as the expression has been evaluated, so we have no way of getting back to the actual struct (or of freeing it). It would be different if we created a statically allocated struct, passed it to the function and got it back with modified values; then the struct would still be accessible. But what about the modified values? By accessing a single member directly from the function call, we discard the copy that was returned to us, and with that copy every modification to its members except the one member we accessed through the call. And I hope you initialize that struct before passing it in; otherwise its members may well contain garbage data.

It seems to have some use when used with 'informative' functions for example:

Let's say we have an array of rectangles (rect_array), and this function:

struct rectangle *get_rect_at(struct rectangle *rect, size_t index) {       /* No pun intended ;) */
    return rect + index;      
}

We could have a printf() call like this:

int index = 2;
printf("The rectangle at index '%d' is: (%d, %d).", index, get_rect_at(rect_array, index)->width, get_rect_at(rect_array, index)->height);

Or, since encapsulation is not exactly a C thing (no accessors needed), maybe we can skip the function altogether and just access the member directly:

int index = 2;
printf("The rectangle at index '%d' is: (%d, %d).", index, rect_array[index].width, rect_array[index].height);

However, accessing an array element through a function can still be a good idea: you could add code inside the function to check that the index is in bounds. But in that case you would want to check the function's return value (against NULL, for example), or check for other anomalies, before trying to access the value. Either way, immediately accessing a member of the struct is not a good idea, and is not even guaranteed to work. I could add the full code of this example if doing so clarifies my point.

So here's the complete and somewhat simplified example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* this is global for MCFE simplicity - please don't do this. */
struct rectangle {
    int width;
    int height;
};


struct rectangle *create_rectangle(int width, int height) {
    struct rectangle *retrect = malloc(sizeof *retrect);
    retrect->width = width;
    retrect->height = height; 
    return retrect;
}

struct rectangle *create_rect_arr(int *width_arr, int *height_arr, size_t nelems) {
    /* allocate nelems structs (not nelems pointers to structs) */
    struct rectangle *rect_arr = malloc(sizeof *rect_arr * nelems);
    size_t i;   /* For our loop: C90 does not allow a declaration inside the for statement */
    if(rect_arr == NULL) {
        return NULL;    /* the caller checks for NULL */
    }
    for(i = 0; i < nelems; i++) {
        (rect_arr + i)->width = *(width_arr + i);
        (rect_arr + i)->height = *(height_arr + i);
    }

    return rect_arr;
}

struct rectangle *get_rect_at(struct rectangle *rect, size_t array_len, size_t index) {       /* No pun intended ;) */
    return (index < array_len) ? rect + index : NULL;     
}

int main (void) {
    /* declarations block (for pedantic C89-C90) */
    int rect_width = create_rectangle(20, 10)->width; /* needlessly leaking a block of memory */
    int width_array[] = {2, 4, 8, 8, 9, 16};
    int height_array[] = {8, 4, 4, 10, 12, 16};
    size_t width_arr_len = (sizeof width_array) / (sizeof width_array[0]);
    size_t height_arr_len = (sizeof height_array) / (sizeof height_array[0]);

    printf("width: %d\n", rect_width);

    if(width_arr_len != height_arr_len) {
        printf("the lengths of 'width_array' and 'height_array' are not the same!\n");
        /* return 1;   // Uncomment if a length mismatch is a deal-breaker for you */
    } else {
        size_t arr_len = width_arr_len;
        struct rectangle *rect_array = create_rect_arr(width_array,
                                                       height_array, 
                                                       arr_len);
        if(rect_array) {
            size_t first_square_index = 1;
            size_t last_square_index = 5;
            /* let's still avoid dereferencing a null pointer. */
            if(first_square_index < arr_len && 
               last_square_index < arr_len) {       
                printf("first square is: (%d, %d)\n",
                       get_rect_at(rect_array, arr_len,
                       first_square_index)->width,
                       get_rect_at(rect_array, arr_len,
                       first_square_index)->height);
                printf("last square is: (%d, %d)\n",
                       get_rect_at(rect_array, arr_len,
                       last_square_index)->width,
                       get_rect_at(rect_array, arr_len,
                       last_square_index)->height);

                printf("Or, without the function:\n");
                printf("first square is: (%d, %d)\n",
                       rect_array[first_square_index].width,
                       rect_array[first_square_index].height);
                printf("last square is: (%d, %d)\n", 
                       rect_array[last_square_index].width, 
                       rect_array[last_square_index].height);
            } else {
                printf("Either 'first_square_index' or 'last_square_index' - or both, are out of the boundaries of the array.\n");
            }
            free(rect_array);   /* release the array once we are done with it */
        } else {
            printf("Couldn't allocate enough memory for 'rect_array'\n");
        } 
    }

    printf("Press [RET] or [ENTER] key to exit the program...");
    getc(stdin); 

    return 0;
}

It accesses struct members directly from function calls in different contexts, with more or less impact in terms of 'unfreeable' memory chunks (sorry for my bastardization of the English language).

So, there's that. To reiterate, my main question (for the veterans of the C language) is: have we always been able to do this? Or, since when? And, if you want to add this to your answer or as a comment: is it really useful?

Thanks in advance.


Solution

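  • To answer your first question (since when have we been able to directly access a member of a struct returned from a function call, as in the example you showed):

    It has been part of C since at least ANSI C (C89/C90), and it remains valid in every subsequent version of the standard. A function call is an ordinary postfix expression, so -> can be applied to its result whenever that result is a pointer to a struct, which is why -std=c89 -pedantic accepts your code without complaint.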

    Reference: https://web.archive.org/web/20200909074736if_/https://www.pdf-archive.com/2014/10/02/ansi-iso-9899-1990-1/ansi-iso-9899-1990-1.pdf

    Your second question is, I think, more about good programming practice regarding encapsulation and proper memory management, especially when working with structs in C.

    As you mentioned above, accessing the members of a struct directly from the function call that returns it can indeed lead to memory issues: a leak when the call returns the only pointer to a heap block, and undefined behaviour when the call returns NULL. So it is usually good practice to use accessor functions, or to store the returned pointer in a variable and access the members through it.