
Pre-ANSI C use of struct selectors


A few years ago, before C was standardized, it was legal to apply struct member selectors directly to addresses. For example, the following code was allowed and frequently used.

#define PTR 0xAA000
struct {  int integ; };

func() {
   int i;
   i = PTR->integ;    /* here, c is set to the first int at PTR */
   return c;
}

Maybe it wasn't very neat, but I like it. In my opinion, the power and versatility of this language also rest on its lack of constraints. Nowadays, compilers just report an error. I'd like to know whether this restriction can be removed in the GNU C compiler.

PS: Similar code was used in the UNIX kernel by the inventors of C (in V6, some dummy structures are declared in param.h).


Solution

  • 'A few years ago' is actually a very, very long time ago. AFAICR, the C in 7th Edition UNIX™ (1979, a decade before the C89 standard was defined) didn't support that notation any more (but see below).

    The code shown in the question only worked when all structure members of all structures shared the same name space. That meant that structure.integ or pointer->integ always referred to an int at the start of a structure because there was only one possible structure member integ across the entire program.
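
By contrast, standard C gives each structure type its own member name space. A minimal sketch of what that means follows; the names `first`, `second`, `read_first` and `read_second` are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structures: both have a member named integ. */
struct first  { int integ; };
struct second { char tag; double integ; };

/* Each selector is resolved using the type of its left operand,
   so the same member name can have a different type and offset. */
int read_first(const struct first *p)
{
    assert(p != NULL);
    return p->integ;      /* int at offset 0 */
}

double read_second(const struct second *p)
{
    assert(p != NULL);
    return p->integ;      /* double at a non-zero offset */
}
```

Because the compiler needs the type of the left operand to decide which `integ` is meant, a bare integer such as 0xAA000 no longer carries enough information for `->`.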

    Note that in 'modern' C (1978 onwards), the structure type declared in the question cannot be referenced anywhere else: it has neither a structure tag nor a typedef, so the type is useless. The original code also returns an undefined variable c.

    To make it work, you'd need something like:

    #define PTR 0xAA000
    struct integ {  int integ; };
    
    int func(void)
    {
       struct integ *ptr = (struct integ *)PTR;
       return ptr->integ;
    }
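
If the goal is to keep the `PTR->integ` spelling, one option is to fold the cast into the macro. The sketch below is only an assumption about how that might look: `INTEG_AT` and `read_integ` are hypothetical names, `uintptr_t` from `<stdint.h>` is assumed to be available, and the result of converting an integer to a pointer is implementation-defined.

```c
#include <stdint.h>

struct integ { int integ; };

/* Hypothetical helper: view an arbitrary address as a struct integ.
   volatile is used on the assumption that the address is a hardware
   location or is otherwise modified outside the program. */
#define INTEG_AT(addr) ((volatile struct integ *)(uintptr_t)(addr))

/* On the real target one might then write:
       #define PTR INTEG_AT(0xAA000)
   after which PTR->integ compiles as in the old code. */

int read_integ(uintptr_t addr)
{
    return INTEG_AT(addr)->integ;
}
```

Taking the address as a parameter here is purely so the helper can be exercised on a hosted system with the address of an ordinary object; dereferencing 0xAA000 itself only makes sense on a machine where that address is mapped.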
    

    C for 7th Edition UNIX

    I suggested that the C with 7th Edition UNIX supported separate namespaces for separate structure types. However, the C Reference Manual published with the UNIX Programmer's Manual Vol 2 mentions in §8.5 Structures:

    The names of structure members and structure tags may be the same as ordinary variables, since a distinction can be made by context. However, names of tags and members must be distinct. The same member name can appear in different structures only if the two members are of the same type and if their origin with respect to their structure is the same; thus separate structures can share a common initial segment.

    However, that same manual also mentions the notations (see also What does =+ mean in C):

    §7.14.2 lvalue =+ expression
    §7.14.3 lvalue =- expression
    §7.14.4 lvalue =* expression
    §7.14.5 lvalue =/ expression
    §7.14.6 lvalue =% expression
    §7.14.7 lvalue =>> expression
    §7.14.8 lvalue =<< expression
    §7.14.9 lvalue =& expression
    §7.14.10 lvalue =^ expression
    §7.14.11 lvalue =| expression

    The behavior of an expression of the form "E1 =op E2" may be inferred by taking it as equivalent to "E1 = E1 op E2"; however, E1 is evaluated only once. Moreover, expressions like "i =+ p" in which a pointer is added to an integer, are forbidden.

    AFAICR, that was not supported in the first C compilers I used (1983 — I'm ancient, but not quite that ancient); only the modern += notations were allowed. In other words, I don't think the C described by that reference manual was fully current when the product was released. (I've not checked my 1st Edition of K&R — does anyone have one on hand to check?) You can find the UNIX 7th Edition manuals online at http://cm.bell-labs.com/7thEdMan/.
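
The "E1 is evaluated only once" rule quoted from §7.14 carries over to the modern `op=` operators. A small sketch that makes the difference observable is shown below; the helper `counted` and both function names are hypothetical, and the modern `+=` spelling is used because a current compiler tokenizes `=+` as `=` followed by unary `+`.

```c
#include <assert.h>

static int calls;  /* counts how many times the lvalue expression runs */

/* Returns p unchanged, recording that it was evaluated. */
static int *counted(int *p)
{
    assert(p != 0);
    calls++;
    return p;
}

/* E1 op= E2: the lvalue expression E1 is evaluated once. */
int calls_for_compound(void)
{
    int x = 5;
    calls = 0;
    *counted(&x) += 3;
    assert(x == 8);
    return calls;
}

/* E1 = E1 op E2 written out by hand: E1 is evaluated twice. */
int calls_for_expanded(void)
{
    int x = 5;
    calls = 0;
    *counted(&x) = *counted(&x) + 3;
    assert(x == 8);
    return calls;
}
```

Both functions leave x at 8; only the number of times the lvalue expression is evaluated differs.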