c string scope literals string-literals

Scope of (string) literals


I always try to avoid returning string literals, because I fear they aren't defined outside of the function. But I'm not sure whether this is the case. Take, for example, this function:

const char *
return_a_string(void)
{
    return "blah";
}

Is this correct code? It works for me, but maybe it only works with my compiler (gcc). So the question is: do (string) literals have a scope, or are they present/defined all the time?


Solution

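  • This code is fine on all platforms. A string literal has static storage duration: the string is compiled into the binary and exists for the entire run of the program, so the returned pointer remains valid after the function returns. On Windows, for example, you can even open your .exe in Notepad and search for the string itself.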

    Because the literal has static storage duration, scope does not matter: scope restricts where a name is visible, not how long the object it refers to lives.
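The difference between a literal and a local array can be sketched as follows. This is only an illustration: return_a_string is the function from the question, while length_of_returned_string is a hypothetical helper added here to show the pointer being used after the call has returned. The unsafe variant is left as a comment so the example stays well defined.

```c
#include <assert.h>
#include <string.h>

/* Safe: the literal has static storage duration, so the pointer
   stays valid after the function returns. */
const char *return_a_string(void)
{
    return "blah";
}

/* NOT safe (shown only as a comment): a local array has automatic
   storage duration and stops existing when the function returns.

   const char *return_a_local(void)
   {
       char buf[] = "blah";
       return buf;   // dangling pointer
   }
*/

/* Hypothetical helper: uses the returned pointer after the call. */
size_t length_of_returned_string(void)
{
    const char *s = return_a_string();
    assert(s != NULL);
    return strlen(s);
}
```

Declaring the return type as const char * also lets the compiler reject accidental writes through the returned pointer.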

    String pooling:

    One thing to look out for is that identical string literals can be "pooled" to save space in the executable file. In that case, every occurrence of the same literal may have the same memory address. You should never assume that this will or will not be the case.

    In most compilers you can set whether or not to use string pooling for string literals.
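A practical consequence of pooling is that string literals should be compared by content, not by address. The sketch below assumes nothing about the compiler; same_text, same_object and pooling_demo are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Compares contents, which is always well defined. */
bool same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Compares addresses. For two identical literals the result depends
   on whether the compiler pooled them, so do not rely on it. */
bool same_object(const char *a, const char *b)
{
    return a == b;
}

bool pooling_demo(void)
{
    const char *first = "pooled?";
    const char *second = "pooled?";

    assert(same_text(first, second));   /* always holds */
    return same_object(first, second);  /* may be true or false */
}
```

Code that behaves correctly whichever value pooling_demo returns is code that makes no assumption about pooling.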

    Maximum size of string literals:

    Several compilers have a maximum size for a string literal. For example, with VC++ this is approximately 2,048 bytes.

    Modifying a string literal gives undefined behavior:

    Never modify a string literal; doing so is undefined behavior. On many systems the write crashes, because literals are often placed in read-only memory.

    char * sz = "this is a test";
    sz[0] = 'T'; //<--- undefined results
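If the text does need to change, one common approach is to initialize a char array from the literal, which gives a private, writable copy. A minimal sketch, where capitalize_first and modify_copy_demo are hypothetical helpers introduced for this example:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: upper-cases the first character of a
   writable buffer. Must not be called on a string literal. */
void capitalize_first(char *s)
{
    assert(s != NULL);
    s[0] = (char)toupper((unsigned char)s[0]);
}

/* Returns 1 if the writable copy was modified as expected. */
int modify_copy_demo(void)
{
    /* The array is initialized FROM the literal, so it is a separate
       object with automatic storage that may be modified freely. */
    char sz[] = "this is a test";

    /* A pointer to the literal itself is best declared const so the
       compiler rejects writes such as ro[0] = 'T'. */
    const char *ro = "this is a test";

    capitalize_first(sz);
    return strcmp(sz, "This is a test") == 0 && ro[0] == 't';
}
```

Unlike the literal, the array sz lives only until modify_copy_demo returns, so a pointer to it must not be returned to the caller.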
    

    Wide string literals:

    All of the above applies equally to wide string literals.

    Example: L"this is a wide string literal";
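The function from the question can be written for wide strings in the same way. This is a sketch only; return_a_wide_string and wide_length are hypothetical names, and wcslen comes from the standard header wchar.h.

```c
#include <assert.h>
#include <wchar.h>

/* Safe for the same reason as the narrow version: the wide literal
   has static storage duration. */
const wchar_t *return_a_wide_string(void)
{
    return L"this is a wide string literal";
}

/* Hypothetical helper: uses the pointer after the call has returned. */
size_t wide_length(void)
{
    const wchar_t *ws = return_a_wide_string();
    assert(ws != NULL);
    return wcslen(ws);   /* length in wchar_t units, not bytes */
}
```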

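    In C the rules are similar, except that the array elements have type char (or wchar_t) without const; modifying them is still undefined behavior. The C++ standard states (section lex.string):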

    1 A string literal is a sequence of characters (as defined in lex.ccon) surrounded by double quotes, optionally beginning with the letter L, as in "..." or L"...". A string literal that does not begin with L is an ordinary string literal, also referred to as a narrow string literal. An ordinary string literal has type "array of n const char" and static storage duration (basic.stc), where n is the size of the string as defined below, and is initialized with the given characters. A string literal that begins with L, such as L"asdf", is a wide string literal. A wide string literal has type "array of n const wchar_t" and has static storage duration, where n is the size of the string as defined below, and is initialized with the given characters.

    2 Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined.