Search code examples
ccompiler-constructionrecordpascaltypechecking

C avoids cycles in type graphs by using structural equivalence for all types except records


Below is an excerpt from the red dragon book.

Consider a linked list of cells, each containing some integer information and a pointer to the next cell in the list. Pascal declarations of type names corresponding to links and cells are:

type link = ↑ cell; 
    cell = record 
               info : integer;
               next : link
           end; 

Note that the type name link is defined in terms of cell and that cell is defined in terms of link, so their definitions are recursive.

Recursively defined type names can be substituted out if we are willing to introduce cycles into the type graph. If pointer (cell) is substituted for link, the type expression shown in Fig. 6.8(a) is obtained for cell. Using cycles as in Fig. 6.8(b), we can eliminate mention of cell from the part of the type graph below the node labeled record.

figure

Example: C avoids cycles in type graphs by using structural equivalence for all types except records.

In C, the declaration of cell would look like

struct cell { 
      int info; 
      struct cell *next; 
 }; 

C uses the keyword struct rather than record, and the name cell becomes part of the type of the record. In effect, C uses the acyclic representation in Fig. 6.8(a).

C requires type names to be declared before they are used, with the exception of allowing pointers to undeclared record types. All potential cycles, therefore, are due to pointers to records. Since the name of a record is part of its type, testing for structural equivalence stops when a record constructor is reached - either the types being compared are equivalent because they are the same named record type or they are inequivalent. □

My doubts are as follows:

  1. Corresponding to the Pascal code written we could write a C code to mimic it as:
   typedef struct cell* link;
   struct cell{
         int info;
         link next;
   };

Now is it so that the above declaration using typedef shall be converted to the code given in the excerpt above (with the typedef being expanded out)?

  1. the name cell becomes part of the type of the record. Now if we declare the structure as:
   typedef struct {
         int info;
   }cell;

Though the above definition is not equivalent to the Pascal definition, but is the name cell still a part of the type of the record now?


Solution

  • In C, typedef names are aliases; that's the actual word used in the standard. An alias is a minor step up from a macro; aliases are scoped and they represent actual things (types, in this case), rather than an input stream which would be reinterpreted in context.

    So they are a bit more disciplined than macros, enough to be usable. But they are still entirely a label which in principle could be replaced throughout the code prior to compilation with the actual object they represent. (In the AST, that is. Obviously there is no way to do that substitution in the program text.)

    Thus, in C, two different aliases to the same type are the same type. The alias name has vanished; it was just a convenience for the programmer.

    That's not true of struct tags. Two struct types with different tags are different types even if their definitions are otherwise identical. The tag is part of the type's identity.

    One interesting case in C is composite types without tags. That's legal, but it acts as though each textual definition had a unique name inserted for the missing tag. Such a struct cannot be referred to twice, so although the following is legal, the function could not be called:

    int f(struct {int a;} x) { return x.a; }
    

    (Actually, that's a bit of a lie. You could call the function from a different file because cross-file calls are allowed to use compatible types. But that's a different discussion.)

    But if you give an untagged struct an alias name, you can use the alias as you wish. That is the difference between substituting the thing represented (a composite type) versus macro-style textual substitution. Thus, there is a big difference between these two

    typedef struct { int a; } T;
    #define T struct { int a; } // NEVER DO THIS