
How does the compiler treat a datatype?


If a pointer points to a particular datatype, how does the compiler know (translate) all the properties (size, ..., what are the others?) of that specific datatype?

If I have for example:

char* foo[] = {"abc", "123", "def"};, then the compiler must make a double pointer (pointer to pointer) in order to have this array (of addresses) of arrays (of chars). But why is the datatype not void* (since each element is an address pointing to a string, and I assume the datatype of an address is void) rather than char?

As I understand it right now, no matter how many addresses have to be dereferenced (char**, char***, ...), the declared datatype is that of the VALUE finally found. So I see it as a 'backstop' on the path: the actual value is found by inspecting the final value's size. Once the compiler has dereferenced all the references (the path it follows), it ONLY knows it has found the value because the value is one byte long and terminated by a null character, as char is (instead of continuing to dereference an address, which is 8 bytes long).

So the question is: how does the compiler know a particular datatype? On what property does the compiler decide what it is and how long it is? Does it base the decision on something else? Or is my conception correct?

PS: pointer arithmetic is not needed in this example.


Solution

  • ...then the compiler must make double pointer (pointer to pointer).

    It's not true that the compiler must create a char ** in this case. The compiler will simply treat char* foo[] = {"abc", "123", "def"}; as an array of 3 pointers to char, each initialized with the address of a string literal.

    However, it is true that if this same variable were passed as a function argument, the array would be converted to a pointer to its first element. Since each element is a char *, that pointer has type char **, which is where your pointer-to-pointer assertion holds. Function prototypes that accept this variable as an argument include the following:

    void func2(char *a[]);
    void func2(char *a[3]);
    void func3(char **a, int order);//includes size information
    

    So given foo as defined above, calls would be made as follows:

    func2(foo);
    func3(foo, 3);// requires size information
    

    Both instances of foo will be treated as char ** when processed by the called function.

    But why is not rather datatype void*...

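    In a nutshell, void * lets the programmer defer the choice of type: it can hold the address of any object, but carries no information about what it points to. The elements of foo are declared char * because each one points to a char, and it is that declared type, not anything inspected at run time, that tells the compiler the size and meaning of the object behind the pointer. When void * is used correctly there is no ambiguity at compile time, because the pointer is converted back to a concrete pointer type before it is dereferenced. (If there is ambiguity, a good compiler will issue a warning or an error.)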

    In C the keyword void is limited in its use:

    • When used as a function return type, it indicates that the function does not return a value.

    • When it appears in a pointer declaration, it specifies that the pointer is generic, i.e. it can hold the address of an object of any type.

    • When used in a function's parameter list, void indicates that the function takes no parameters.

    This list does not include declaring a non-pointer variable of type void, such as:

    void variable = s;// Illegal, creates a compile-time error.  
    

    However, void * pointers are useful for handling varying data types that might be passed as function parameters. For example, given:

    typedef union {
        int a;
        char b;
        float c;
        double d;
    } TYP;
    TYP typ;

    enum {
        INT,
        CHAR,
        FLT,
        DBL
    };

    void func4(void *a, int type)// void * provides type ambiguity in argument 1
    {                            // i.e. caller can pass the address of an int, char, float or double
        switch(type) {
            case INT:
                typ.a = *(int *)a;   // convert back to int *, then dereference
                break;
            case CHAR:
                typ.b = *(char *)a;
                break;
            case FLT:
                typ.c = *(float *)a;
                break;
            case DBL:
                typ.d = *(double *)a;
                break;
        }
    }
    

    With calling examples:

    char   c = 'P';
    int    d = 1024;
    float  e = 14.5;
    double f = 0.0000012341;
    
    func4(&c, CHAR);
    func4(&d, INT);
    func4(&e, FLT);
    func4(&f, DBL);