Search code examples
cstringc99ansi-c

How do strings work in C?


String is said to be a constant in C programming language.

So, when I give a statement like char *s = "Hello", I have learned that s points to a memory location of H since "Hello" is stored in some static memory of the program and also "Hello" is immutable.

Does it mean the variable s is now a variable of type pointer to constant data such as const int a = 3;const int *i = &a;. This seems so because I can't manipulate the data (when I do, it results in segmentation fault).

But, if it is so, shouldn't compiler be able to detect and say that I have assigned qualified data to unqualified variable. Something like char *p p is a pointer to unqualified character and when I say char *p="Hello" p, the pointer to unqualified character can't point to a const character type?

What am I missing here?

If it is not the case as above, then how is an array of constant characters made immutable?


Solution

  • First of all, a string in C isn't immutable. C doesn't even know a type for strings -- a string is just defined as a sequence of char ending with '\0'.

    What you're talking about are string literals and they can be immutable. The C standard defines that attempting to modify a string literal is undefined behavior, still their type is char *. So, if you are sure that in your implementation of C, a string literal is writable, you can do so! *)

    But your code won't be well-defined C any more and won't work on other platforms with read-only string literals. It will compile, because writing through char * is perfectly fine, but fail at runtime in unpredictable ways (like, possibly, a crash).

    Therefore, it's just best practice for portable code to assign string literals only to const char * pointers and, if you need a mutable string, use the string literal as an initializer for a char [].


    *) beware this is very uncommon, you'll find it nowadays only with specialized compilers targeting embedded or very old platforms. A modern platform will place string literals in a read-only data segment or similar.