arrays, c, string, nul

Why is there a disparity between an array of chars and a string in C?


I noticed that if you define a string in C, you need to take the nul terminator into consideration, which makes intuitive sense: the computer needs some way to check that the string has hit its end.

But then when you do something like anytype arr[] = {val1, val2, ...}, you suddenly don't need to take into consideration the nul terminator.

How does the computer know that it has hit the end of the array without any special identifier?

I tried searching, but the answers did not specifically address how the computer knows where an array ends.


Solution

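  • Unlike many other languages, C arrays do not carry their size with them at run time. The compiler knows an array's size only where its declaration is visible (that is how anytype arr[] = {val1, val2, ...} gets its length from the initializer); once the array is passed to a function, it is just a pointer to the first element. Nothing checks accesses, so it's entirely possible to read or write outside of an array's bounds. On modern systems, segmentation and paged virtual memory provide some protection, but they still let you read outside of the array anywhere within the memory owned by the process.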

    To answer your question:

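    The \0 character automatically added to the end of string literals is a standard terminator that many C library functions rely on. It's so important that UTF-8 was designed to be compatible with it: a zero byte never appears inside the encoding of any other character, and overlong encodings, which would allow NUL to be encoded without a zero byte, are explicitly forbidden.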

    However, there is no standard terminator for arrays of other types, such as integers, floating-point values, or structures. In these cases, you have to pass the size of the array along with it so that the called function won't access memory out of bounds.

    In some cases, you may know that certain values can never occur in the data, so, for example, you could use -1 to mark the end of an array of integers. Any such design decision is specific to the application.
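
As a rough illustration of the conventions described above, here is a minimal sketch in C. The function names (sum_counted, sum_until_sentinel, count_chars) are made up for this example, and the choice of -1 as a sentinel assumes that -1 can never be valid data in your application.

```c
#include <stddef.h>

/* Convention 1: the caller passes the element count explicitly.
 * The function has no other way to learn where the array ends. */
long sum_counted(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Convention 2: the array ends with a sentinel value.
 * ASSUMPTION: -1 never occurs as real data; this is an
 * application-specific choice, not something C provides. */
long sum_until_sentinel(const int *values)
{
    long total = 0;
    for (size_t i = 0; values[i] != -1; i++)
        total += values[i];
    return total;
}

/* Strings are convention 2 with '\0' as the sentinel. This loop
 * does the same job as the standard strlen: it walks the bytes
 * until it finds the terminator. */
size_t count_chars(const char *text)
{
    size_t n = 0;
    while (text[n] != '\0')
        n++;
    return n;
}
```

Where the array declaration is still in scope, the count can come from the compiler, for example sum_counted(a, sizeof a / sizeof a[0]) for int a[] = {1, 2, 3}. The same trick stops working inside a function that receives the array as a parameter, because there sizeof reports the size of a pointer. For strings, sizeof "abc" is 4 rather than 3, since the literal includes the terminating \0.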