I have a function taking an array of strings as a parameter:
int parseArguments(int argc, const char* const* argv);
I want to mark everything as const to allow any input argument for this function, constant or not. Now, I would like to call this function with main's argv:
int main(int argc, char** argv)
{
    parseArguments(argc, (const char* const*)argv);
}
I know that passing a pointer to an unqualified type where a pointer to a const-qualified type is expected is fine (e.g. passing char* to a function expecting const char*) and that the implicit conversion does not apply to nested pointers (e.g. passing char** to a function expecting const char** raises a warning). I am also aware of the example in the standard explaining the reason for this:
const char** cpp;
char* p;
const char c = 'A';
cpp = &p; // constraint violation
*cpp = &c; // valid
*p = 0; // valid
// this code would end up modifying the value of a constant object
and I do not think it applies here since I am only passing the nested pointer to the function.
What I would like to know and that I could not really find in the C standard is if doing the explicit cast above to "add nested const qualification" is a valid operation or if I would have to do something along the lines of
size_t s = argc * sizeof(argv[0]);
const char** arr = malloc(s);
// error checking
memcpy(arr, argv, s);
I could also directly define main to have constant arguments (int main(int argc, const char** argv)) but I would prefer to keep my program strictly conforming.
What I would like to know and that I could not really find in the C standard is if doing the explicit cast above to "add nested const qualification" is a valid operation
The general rule is this:
A pointer to an object type can be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.
(C23 6.3.3.3/7)
There are some special cases, too, but none of them apply to your example.
It is technically possible that the kind of cast you propose could, in some circumstances, run afoul of the alignment provision, but I don't think that's anything to fear in practice.
However, that addresses only the conversion itself and its reverse. There is nothing there about accessing the pointed-to object via the converted pointer, which I presume you want to be able to do. For the particular kind of conversion you describe, that question comes down to the strict-aliasing rule (C23 6.5.1/8):
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
[a few other type categories not relevant to the present question ...]
(For objects with declared types, such as in your example, "effective type" is the same as declared type.)
Let's consider your example:
int parseArguments(int argc, const char* const* argv);
[...]
int main(int argc, char** argv) { parseArguments(argc, (const char* const*)argv); }
What can parseArguments() definedly do when called that way? Well, its argv has type const char * const *, so in that function, the expression *argv has type const char * const. Likewise argv[n] for integer n. On the other hand, if we suppose that that argv points to the same object that main's argv does, then that object has type char *. Those are not compatible types, nor is const char * const a qualified version of char *,* so any such access violates the strict-aliasing rule, yielding undefined behavior.
parseArguments() can convert its argv to a different object pointer type, or store it in a variable, or pass it as a function argument, or examine its representation. It might be able to do pointer arithmetic with it. If it happens to convert it to type char * const * or char ** then it can definedly dereference it in the particular example presented, but not otherwise. Supposing that you do want to be able to dereference it in the case where its argv is (a converted version of) main's argv, you'd probably be best off declaring it as char **, like main's, or perhaps char * const * if you want to generalize.
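To illustrate the char * const * suggestion: a char ** converts implicitly to char * const *, because that only adds a qualifier at the first level of indirection. Here is a minimal sketch, in which countFlags() and countFlagsInSample() are hypothetical names standing in for parseArguments() and one of its callers:

```c
#include <stddef.h>

/* Hypothetical stand-in for parseArguments(): counts the arguments that
 * start with '-'. The parameter type promises not to modify the array of
 * pointers, yet its elements keep the type char *, matching main's argv. */
int countFlags(int argc, char *const *argv)
{
    int flags = 0;
    for (int i = 1; i < argc; i++) {
        if (argv[i] != NULL && argv[i][0] == '-') {
            flags++;
        }
    }
    return flags;
}

/* Example caller: a char ** is passed without any cast. */
int countFlagsInSample(void)
{
    char prog[] = "prog", a[] = "-v", b[] = "file", c[] = "-x";
    char *args[] = { prog, a, b, c, NULL };
    char **argv = args;            /* same type as main's argv */
    return countFlags(4, argv);    /* implicit char ** -> char *const * */
}
```

From main, the call would simply be countFlags(argc, argv). Note that this signature protects only the array of pointers; the function could still write to the characters themselves.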
You wrote:
I know that passing a pointer to an unqualified type where a pointer to a const-qualified type is expected is fine (e.g. passing char* to a function expecting const char*) and that the implicit conversion does not apply to nested pointers (e.g. passing char** to a function expecting const char** raises a warning).
"Raises a warning" is something that a compiler may choose to do under those circumstances. C does not define that or any other particular behavior in that case.
I am also aware of the example in the standard explaining the reason for this [...]
That's better described as a rationale for the spec's provisions in this area. The reason is that the spec says that the behavior is undefined. This distinction is important because the behavior would be undefined even if there were no issue such as the rationale describes. In particular, this ...
and I do not think it applies here since I am only passing the nested pointer to the function.
... is moot. It does not matter that the rationale might not apply to your case, as it is just an explanation of the reason for the rule, not the rule itself.** It does not matter that C's rule in this area is stricter than it needs to be (as C++ demonstrates). What you propose to do cannot be done in strictly conforming C. That is, making all the levels in the pointer derivation const does not have the effect, as far as standard C is concerned, of allowing inputs with arbitrary combinations of const qualification.
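If the parser really must take a const char * const *, one way to stay within defined behavior is to build a genuine array of const char * and fill it by assignment, which performs the permitted char * to const char * conversion one element at a time. The following is only a sketch, and makeConstArgv() is a hypothetical helper name. It uses assignment rather than memcpy() because, as I read the effective-type rules, copying into allocated storage with memcpy() would carry over the source's effective type, char *, and leave the aliasing problem in place.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a newly allocated array of argc + 1
 * pointers to const char, or NULL on failure (including argc < 0).
 * The strings themselves are shared with argv, not copied. */
const char **makeConstArgv(int argc, char *const *argv)
{
    if (argc < 0) {
        return NULL;
    }
    const char **copy = malloc(((size_t)argc + 1) * sizeof *copy);
    if (copy == NULL) {
        return NULL;
    }
    for (int i = 0; i < argc; i++) {
        copy[i] = argv[i];   /* implicit char * -> const char * */
    }
    copy[argc] = NULL;       /* keep the argv[argc] == NULL convention */
    return copy;
}
```

The caller would pass the result to parseArguments(argc, copy), where the implicit conversion from const char ** to const char * const * is allowed, and free() it afterwards.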
You could conceivably provide two versions of parseArguments(), with a type-generic macro in front. Something like this, for example:
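int parseConstArguments(int argc, const char * const *argv);
int parseModifiableArguments(int argc, char * const *argv);
/* Select on *(argv), not **(argv): the controlling expression undergoes
   lvalue conversion, which would strip the const from **(argv) and make
   both cases look like plain char. */
#define parseArguments(argc, argv) _Generic(*(argv), \
const char *: parseConstArguments, \
char *: parseModifiableArguments \
)((argc), (argv))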
That does still leave you providing two separate argument-parsing functions, of course, but users could call the appropriate one automatically via parseArguments(argc, argv). I guess that if you wanted to minimize code duplication between the two parser implementations then you could probably apply more macros to do so.
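For a fuller picture, here is a self-contained sketch of that dispatch with toy function bodies. The bodies and the two dispatchOn...() callers are hypothetical and exist only to show which function gets selected. The macro selects on the type of *(argv); the controlling expression of _Generic undergoes lvalue conversion, so top-level qualifiers are dropped, while the const inside the pointed-to type survives. This assumes a compiler that supports C11 or later.

```c
#include <stddef.h>

/* Hypothetical implementations: each counts the non-null entries and adds
 * an offset so that a caller can tell which version ran. */
int parseConstArguments(int argc, const char *const *argv)
{
    int n = 0;
    for (int i = 0; i < argc; i++) {
        if (argv[i] != NULL) n++;
    }
    return 100 + n;   /* 100-series: the const version ran */
}

int parseModifiableArguments(int argc, char *const *argv)
{
    int n = 0;
    for (int i = 0; i < argc; i++) {
        if (argv[i] != NULL) n++;
    }
    return 200 + n;   /* 200-series: the modifiable version ran */
}

/* *(argv) is const char * for const inputs and char * otherwise. */
#define parseArguments(argc, argv) _Generic(*(argv), \
    const char *: parseConstArguments,               \
    char *:       parseModifiableArguments           \
)((argc), (argv))

int dispatchOnConst(void)
{
    const char *args[] = { "prog", "-v", NULL };
    return parseArguments(2, args);   /* selects parseConstArguments */
}

int dispatchOnModifiable(void)
{
    char prog[] = "prog", flag[] = "-v";
    char *args[] = { prog, flag, NULL };
    return parseArguments(2, args);   /* selects parseModifiableArguments */
}
```

In both callers the selected function receives its argument through an ordinary implicit conversion (const char ** to const char * const *, or char ** to char * const *), so no cast is involved.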
* Type char * const would be a qualified version of char *, but const char * const is not. Nor is the latter a qualified version of a type compatible with char *: const char * is not compatible with char * because const char is not compatible with char.
** Though I suspect that another reason for the rule, perhaps even more significant, is that it considerably simplifies this aspect of the spec. Notwithstanding that it fails to accommodate some things that seem sensible.