Search code examples
cstringmallocconcatenationrealloc

Declare string in C without giving size


I want to concatenate in a string multiple sentences. At the moment my buffer is with fixed size 100, but I do not know the total count of the sentences to concatenate and this size can be not enough in the future. How can I define a string without defining its size?

char buffer[100]; 
int offset = sprintf (buffer, "%d plus %d is %d", 5, 3, 5+3);
offset += sprintf (buffer + offset, " and %d minus %d is %d", 6, 3, 6-3);
offset += sprintf (buffer + offset, " even more");
printf ("[%s]",buffer);  

Solution

  • This is a fundamental aspect of C. C never does automatic management of dynamically-constructed strings for you — this is always your responsibility.

    Here is an outline of four different techniques you might use. You can ask additional questions about any of these that aren't clear.

    1. Run through your string-construction process twice. Make one pass to collect the lengths of all the substrings, then call malloc to allocate a buffer of the computed size, then make a second pass to actually construct your string.

    2. Allocate a smallish (or empty) initial buffer with malloc, and then, each time you're about to append a new substring to it, check the buffer's size, and if necessary grow it bigger using realloc. (In this case I always use three variables: (1) pointer to buffer, (2) allocated size of buffer, (3) number of characters currently in buffer. The goal is to always keep (2) ≥ (3).)

    3. Allocate a dynamically-growing "memstream" and use fprintf or the like to "print" to it. This is an ideal technique, although memstreams are not standard and not supported on all platforms, and dynamically-allocating memstreams are even more exotic and less common. (It's possible to write your own, but it's a lot of work.) You can open a fixed-size memstream using fmemopen (although this is not what you want), and you can open the holy grail, a dynamically-allocating memstream (which is what you want) using open_memstream, if you have it. Both are documented on this man page. (This technique is analogous to stringstream in C++.)

    4. The "better to beg forgiveness than ask permission" technique. You can allocate a buffer which you're pretty sure is amply big enough, then blindly stuff all your substrings into it, then at the very end call strlen on it and, if you guessed wrong and the string is longer than the buffer you allocated, print a big scary noisy error message and abort. This is a blunt and risky technique, not one you'd use in a production program. It's possible that when you overflow the buffer, you damage things in a way that causes the program to crash before it has a chance to perform its belated check-and-maybe-exit step at all. If you used this technique at all, it would be considerably "safer" (that is, considerably less likely to prematurely crash before the check) if you allocated the buffer using malloc, than if you declared it as an ordinary, fixed-size array (whether static or local).

    Personally, I've used all four of these. Out in the rest of the world, numbers 1 and 2 are both in common use by just about everybody. Comparing them: number 1 is simple and a bit easier but has an uncomfortable amount of code replication (and might therefore be brittle if new strings are added later); number 2 is more robust but obviously requires you to be comfortable with the way realloc works (and this technique can be less than robust in its own way, if you've got any auxiliary pointers into your buffer which need to be ponderously relocated each time realloc is called).

    Number 3 is an "exotic" technique: theoretically almost ideal, but definitely more sophisticated and necessitating some extra support, since nothing like open_memstream is standard. And number 4 is obviously a risky and inherently not reliable technique, which you'd use — if at all — only in throwaway or prototype code, never in production.