I'm writing a C program to find the longest line in the user's input and print the line's length and the line itself. It succeeds at counting the characters but unpredictably fails at storing the line itself. Maybe I'm misunderstanding C's memory management and someone can correct me.
EDIT: follow-up question: I understand now that the blocks following the dummy char are unallocated and thus fair game for the computer to do anything with them, but then why does the storage of some chars still work? In the second example I mention, the program stores characters in the 'unallocated' blocks even though it 'shouldn't'. Why?
Variables:
c stores the result of getchar() every time I call getchar().
i is the length (so far) of the current line I'm reading with getchar().
longest_i is the length of the longest line so far.
twostr points to the beginning of the first of two strings: the first for the current line, the second for the longest line so far. When a line is discovered to be the longest, it is copied into the second string. If a future line is even longer, it overwrites some of the second string, but that's OK because I won't use it anymore -- the second string will now begin at a location farther to the right.
dummy gives twostr a place to point to.
This is how I visualize the memory used by the program's variables:
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|\n| 7|11|15|c |u |r |r |e |n |t |\0|e |s |t |\0|p |r |e |v |l |o |n |g |e |s |t |\0|
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
true statements:
&c == 11
&i == 12
&longest_i == 13
&twostr == 14
&dummy == 15
program:
#include <stdio.h>
int main()
{
    char c = '\0';
    int i, longest_i;
    char *twostr;
    longest_i = i = 0;
    char dummy = '\0';
    twostr = &dummy;
    while ((c=getchar()) != EOF)
    {
        if (c != '\n')
        {
            *(twostr+i) = c;
            i++;
        }
        else
        {
            *(twostr+i) = '\0';
            if (i > longest_i)
            {
                longest_i = i;
                for (i=0; (c=*(twostr+i)) != '\0'; ++i)
                    *(twostr+longest_i+1+i) = c;
            }
            i = 0;
        }
    }
    printf("length is %d\n", longest_i);
    for (i=0; (c=*(twostr+longest_i+1+i)) != '\0'; ++i)
        putchar(c);
    return 0;
}
Everything from *(twostr+longest_i+1) until '\0' is unpredictable. Examples:
input:
longer line
line
output:
length is 11
@
input:
this is a line
this is a longer line
shorter line
output:
length is 21
this is a longer lineÔÿ"
First, you will need to make sure that twostr has sufficient space to hold the strings that you're managing. As written, twostr points at a single char (dummy), so every write past that first byte lands in memory the program does not own. You will likely need to add some logic to allocate initial space as well as to allocate additional space when needed. Something like:
size_t twostrLen = 256;
char* twostr = malloc(twostrLen);
Then, when inserting data into this buffer, you'll need to allocate additional memory if your index would reach or exceed the current value of twostrLen:
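/* needs #include <stdlib.h> and #include <string.h> */
if (i >= twostrLen) {
    char* tmp = twostr;
    size_t oldLen = twostrLen;
    twostrLen *= 2;
    twostr = malloc(twostrLen);
    /* copy the whole old buffer so the saved longest line survives too */
    memcpy(twostr, tmp, oldLen);
    free(tmp);
}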
Where i is the offset from twostr that you're about to write to.
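The growth step can also be wrapped in a small helper. The following is only a sketch under assumptions: the name ensure_capacity and its parameters are hypothetical, and it swaps the malloc/memcpy/free sequence for realloc, which keeps the existing contents and leaves the old buffer untouched if the allocation fails.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): make sure *buf can hold
 * at least `needed` bytes. Returns 1 on success, 0 if the allocation fails,
 * in which case *buf and *cap are left unchanged and still valid. */
static int ensure_capacity(char **buf, size_t *cap, size_t needed)
{
    assert(buf != NULL && cap != NULL);

    if (needed <= *cap)
        return 1;                          /* already large enough */

    size_t new_cap = (*cap == 0) ? 16 : *cap;
    while (new_cap < needed) {
        if (new_cap > SIZE_MAX / 2)
            return 0;                      /* doubling would overflow */
        new_cap *= 2;                      /* doubling keeps reallocations rare */
    }

    char *tmp = realloc(*buf, new_cap);    /* realloc(NULL, n) acts like malloc(n) */
    if (tmp == NULL)
        return 0;

    *buf = tmp;
    *cap = new_cap;
    return 1;
}
```

With a helper like this, each write of the form *(twostr+i) = c would be preceded by a call such as ensure_capacity(&twostr, &twostrLen, i + 1), checking the return value before writing.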
Finally, when copying from the current string to the longest string, your loop termination condition is (c=*(twostr+i)) != '\0'. The loop exits as soon as c is '\0', before the terminating null is written to the destination. You'll need to make sure the null is written for the loop that prints the string to work correctly. Adding the following after your innermost for loop should address the issue:
*(twostr+longest_i+1+i) = 0;
Without this, your last loop will continue to read until it happens to encounter a null character. That could be immediately, or it could be some number of bytes later (like your second example, where additional characters are printed after the line).
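To make the terminator issue concrete, here is a minimal sketch of a copy loop that stops on '\0' and then writes the terminator explicitly. The function name copy_line is hypothetical, and the sketch assumes the caller has already made sure the destination has room for the string plus one byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): copy src into dst,
 * terminator included, and return the number of characters copied.
 * Assumption: dst has room for the length of src plus one byte. */
static size_t copy_line(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t i;
    for (i = 0; src[i] != '\0'; ++i)
        dst[i] = src[i];   /* the loop exits before the '\0' is copied */

    dst[i] = '\0';         /* so the terminator has to be written here */
    return i;
}
```

Any later loop that reads dst until '\0' then stops at the end of the copied text instead of running on into whatever bytes follow.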
Again, remember to check that longest_i+1+i < twostrLen before writing to that location.
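Putting the pieces together, one possible shape for the whole thing is sketched below. It is not a drop-in fix for the posted program. As assumptions of this sketch, it uses two separate heap buffers instead of the single twostr layout, reads from a FILE * so it can be exercised without stdin, treats a final line without a trailing newline as a line, and uses the hypothetical name longest_line. It also declares c as int, since getc and getchar return an int and EOF cannot be reliably told apart from a character once the value is stored in a char.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function (not from the original post): returns a malloc'd
 * copy of the longest line read from `in`, without its newline, and stores
 * the length in *len_out. Returns NULL if an allocation fails.
 * The caller is responsible for freeing the result. */
char *longest_line(FILE *in, size_t *len_out)
{
    assert(in != NULL && len_out != NULL);

    size_t cap = 256, i = 0, longest_i = 0;
    char *cur = malloc(cap);        /* current line */
    char *longest = malloc(1);      /* longest line so far, starts empty */
    if (cur == NULL || longest == NULL) {
        free(cur);
        free(longest);
        return NULL;
    }
    longest[0] = '\0';

    int c;  /* int, not char, so EOF is distinct from every byte value */
    for (;;) {
        c = getc(in);
        if (c != '\n' && c != EOF) {
            if (i + 1 >= cap) {     /* keep one byte free for the terminator */
                char *tmp = realloc(cur, cap * 2);
                if (tmp == NULL) { free(cur); free(longest); return NULL; }
                cur = tmp;
                cap *= 2;
            }
            cur[i++] = (char)c;
            continue;
        }

        cur[i] = '\0';              /* end of line or end of input */
        if (i > longest_i) {
            char *tmp = realloc(longest, i + 1);
            if (tmp == NULL) { free(cur); free(longest); return NULL; }
            longest = tmp;
            memcpy(longest, cur, i + 1);  /* i + 1 copies the terminator too */
            longest_i = i;
        }
        i = 0;
        if (c == EOF)
            break;
    }

    free(cur);
    *len_out = longest_i;
    return longest;
}
```

A caller would pass stdin and a size_t for the length, print the length and the returned string, and then free the string. Keeping the two strings in separate buffers avoids the index arithmetic of the twostr+longest_i+1 layout, at the cost of a second allocation.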