
In C, are string literals with identical content always stored at the same memory address?


I have this code:

#include <stdio.h>
#include <time.h>

clock_t record[2][15];
char *blockName[15];
int count = 0;

/* Start timing a block: store its name and its start time. */
#define BEGIN(block_name) \
        do{ \
                blockName[count] = #block_name; \
                record[0][count++] = clock(); \
        }while( 0 )

/* Stop timing a block: find it by comparing the addresses of the literals. */
#define END(block_name) \
        do{ \
            for( int i = 0; i < count; i++ ) \
                if( #block_name == blockName[i] ){ \
                    record[1][i] = clock(); \
                    break; \
                } \
        }while( 0 )

/* Print the elapsed time of every recorded block. */
#define RESULT \
        do{ \
            for( int i = 0; i < count; i++ ) \
                printf( "block %s costs %f seconds\n", blockName[i], \
                (double)(record[1][i]-record[0][i])/ CLOCKS_PER_SEC ); \
        }while( 0 )

When I compile the code with -Wall, I get this warning:
warning: comparison with string literal results in unspecified behavior [-Waddress]
I know a trick: strings with identical content are stored at the same memory address.
So I wrote the line if( #block_name == blockName[i] ) to compare the two strings, but I do not know whether that always holds.
Does the warning mean that the trick doesn't work on all platforms?


Solution

  • No. Whether identical string literals end up at the same address is up to the compiler; the C standard leaves it unspecified, which is exactly what the warning says. So a simple address comparison is not a reliable way to tell whether two strings are equal, even if both are hardcoded in the same source file.

    For example, Microsoft calls this "String Pooling" and allows you to specifically enable or disable it.

    Take the following simple code as an example:

    const char *text1 = "Hello World!";
    const char *text2 = "Hello World!";
    
    printf("Hello World!");
    

    Without this optimization, the resulting executable might contain up to three copies of the string "Hello World!".

    Depending on the compiler (and, in the case of MSVC, whether "String Pooling" is enabled), you could end up with two different memory layouts:

    • Optimized for size (and/or with string pooling enabled):

      (other data)Hello World!\0(other data)

    • Not optimized for size (and/or with string pooling disabled):

      (other data)Hello World!\0Hello World!\0Hello World!\0(other data)

    Of course, the actual layout might still differ a bit. There could be other data in between (even if it's just padding).
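To see which layout a particular compiler and set of flags produces, the addresses can simply be compared at run time. The following is a minimal sketch; the function names are made up for illustration. The first function may return either value depending on the build, while the second always returns true because it compares contents.

```c
#include <stdbool.h>
#include <string.h>

/* Two separate occurrences of the same literal. Whether they share
   storage is unspecified, so this may return true or false. */
static bool literals_share_storage(void)
{
    const char *text1 = "Hello World!";
    const char *text2 = "Hello World!";
    return text1 == text2;   /* compares addresses, not contents */
}

/* Comparing the contents does not depend on the memory layout. */
static bool literals_same_content(void)
{
    const char *text1 = "Hello World!";
    const char *text2 = "Hello World!";
    return strcmp(text1, text2) == 0;
}
```

Building this once with and once without optimization (or with string pooling toggled) is a quick way to observe the difference, if there is one on the toolchain at hand.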

    As general advice: never assume anything about memory alignment or memory addresses unless you defined the layout yourself (e.g. as part of a struct).
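Applied to the lookup in END, this means comparing the contents of the strings rather than their addresses. A minimal sketch using strcmp from <string.h> could look like this; find_block is a hypothetical helper, not something from the original code.

```c
#include <string.h>

/* Hypothetical helper: returns the index of `name` in `names`, or -1
   if it is not present. It compares contents with strcmp, so it works
   whether or not the compiler merges identical literals. */
static int find_block(const char *const names[], int count, const char *name)
{
    for (int i = 0; i < count; i++) {
        if (strcmp(names[i], name) == 0) {
            return i;
        }
    }
    return -1;
}
```

Inside the END macro, the condition would then become strcmp( #block_name, blockName[i] ) == 0, which also makes the -Waddress warning go away. If the helper is called directly, declaring the array as const char *blockName[15] lets it be passed without a cast.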


    As for your actual problem, I assume you want an easy way to measure times? If so, why not use token concatenation (timer ## block_name) to create your variables on the fly?
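A rough sketch of that idea, under a few assumptions: the variable prefixes timer_begin_ and timer_end_ and the ELAPSED macro are invented for this example, RESULT takes the block name as an argument (unlike the original), and BEGIN, END and RESULT for one block must all appear in the same scope, with each block name used only once per scope.

```c
#include <stdio.h>
#include <time.h>

/* Each block gets its own pair of variables, named by token pasting,
   so no string lookup (and no string comparison) is needed. */
#define BEGIN(block_name) \
    clock_t timer_begin_##block_name = clock()

#define END(block_name) \
    clock_t timer_end_##block_name = clock()

/* Elapsed CPU time of a block in seconds, as a double. */
#define ELAPSED(block_name) \
    ((double)(timer_end_##block_name - timer_begin_##block_name) / CLOCKS_PER_SEC)

/* The stringified name is only used for printing here. */
#define RESULT(block_name) \
    printf("block %s costs %f seconds\n", #block_name, ELAPSED(block_name))
```

Usage would be BEGIN(sort); followed by the code to measure, then END(sort); and RESULT(sort);. A misspelled block name becomes a compile error instead of a silently missing measurement, at the cost of no longer being able to print all results in one loop.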