
Extracting unique elements in an array (from K and R C ex1-14)


Returning C newbie here again. I am working through the exercises in K and R C, and on my way to exercise 1-14 I got really stumped. My solution works for some inputs but is not always correct. I am looking for help refining what I have written, or for a better (easier-to-understand!) way. My program:

#include <stdio.h>

/* How many times a character appears in an array */
main()
{
    int c;
    int count = 0;
    int uniquecount = 0;
    char array[20];
array[0] = '\0';

while((c = getchar()) != EOF)
{
    array[++count] = c;
}

/* for each element in array,
 * check if array[each] in newarray.
 * if array[each] in newarray
 * break and start checking again.
 * if array[each] not in newarray
 * add array[each] to end of newarray*/

printf("count = %d\n", count);
array[count] = '\0';
char newarray[count];
newarray[0] = '\0';

for(int a = 0; a < count; ++a)
{
    for(int b = 0; b <= a; ++b)
    {
        if(newarray[b] == array[a])
            break;
        if(newarray[b] != array[b])
        {
            newarray[b] = array[b];
            ++uniquecount;
        }

    }
}

printf("uniquecount = %d\n", uniquecount);    
newarray[uniquecount + 1] = '\0';

printf("array => ");
for(int i = 0; i < count; ++i)
    printf("\'%c\'", array[i]);
printf("\n");
printf("newarray => ");
for(int i = 0; i < uniquecount + 1; ++i)
{
    if(newarray[i] != '\0')
        printf("\'%c\'", newarray[i]);
}
printf("\n");

}

When I try some simple strings, it sometimes works and sometimes doesn't:

./times_in_array 
this is 
count = 9
uniquecount = 5
array => '''t''h''i''s'' ''i''s'' '
newarray => 't''h''i''s'' '
 ./times_in_array 
something comes
count = 16
uniquecount = 11
array => '''s''o''m''e''t''h''i''n''g'' ''c''o''m''e''s'
newarray => 's''o''m''e''t''h''i''n''g'' ''c'
 ./times_in_array 
another goes
count = 13
uniquecount = 12
array => '''a''n''o''t''h''e''r'' ''g''o''e''s'
newarray => 'a''n''o''t''h''e''r'' ''g''o''e''s'

Please can someone guide me to where I am wrong? Thanks a lot!


Solution

  • We usually write int main(void) instead of main().

    Moreover, it's very important to understand that when you are accessing your new array like this:

    newarray[b]
    

    you are actually reading uninitialized memory, because newarray is never filled in before the loop runs. Since the chance that a leftover byte there happens to match array[a] is small, you seem to get away with it, but that is not something you can rely on.

    For that reason I suggest you initialize your new array, like this:

    for(int i = 0; i < count; ++i)
        newarray[i] = '\0';
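
As an alternative to the loop above, the standard memset function from <string.h> can zero the whole buffer in one call. The sketch below wraps it in a hypothetical helper, clear_chars, which is not part of the original program. It assumes newarray is a variable-length array, which cannot take an = {0} initializer in C99 or C11.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in the original post): set the first n
   chars of buf to '\0' so later reads see a known value. */
void clear_chars(char *buf, size_t n)
{
    assert(buf != NULL);
    memset(buf, 0, n);
}
```

In the program above this would be called as clear_chars(newarray, count) right after newarray is declared.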
    

    Now, instead of handing you my solution, I would rather help you understand what went wrong; that is how you learn. It really helps to print your data at the start of the inner loop, for example:

    printf("b = %d, a = %d, newarray[b] = %c, array[a] = %c, array[b] = %c\n", b, a, newarray[b], array[a], array[b]);
    

    and print a message when you increase the unique counter, like this:

    printf("UNIQUE, %d\n", uniquecount);
    

    When you execute your program, you will see:

    ...
    b = 9, a = 12, newarray[b] = g, array[a] = s, array[b] = g
    b = 10, a = 12, newarray[b] = , array[a] = s, array[b] = o
    UNIQUE, 10
    b = 11, a = 12, newarray[b] = , array[a] = s, array[b] = e
    UNIQUE, 11
    b = 12, a = 12, newarray[b] = , array[a] = s, array[b] = s
    UNIQUE, 12
    uniquecount = 12
    array => '''a''n''o''t''h''e''r'' ''g''o''e''s'
    newarray => 'a''n''o''t''h''e''r'' ''g''o''e''s'
    

    which is a strong hint about what's wrong. After 's' (the last letter of goes) is found to be unique for the first time (that's good), the loop does not break, so your code keeps checking the new array and is fooled into counting this 's' as unique again.

    So add a break when you find a unique element, and you should be fine for the moment:

    ++uniquecount;
    break;
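
For comparison, once the debugging exercise above has made the problem clear, here is a minimal sketch of one common way to collect unique characters. It is not the method from the answer above. The helper name collect_unique and its parameters are hypothetical. The idea is to search only the part of the output that has already been filled, and to append a character only when that search finds no match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post).
   Copies each distinct character of src[0..n-1] into dst, in order of
   first appearance, and returns how many characters were copied.
   Assumes dst has room for at least n chars. */
size_t collect_unique(const char *src, size_t n, char *dst)
{
    size_t unique = 0;

    assert(src != NULL && dst != NULL);

    for (size_t a = 0; a < n; ++a) {
        size_t b;

        /* Look only at the entries of dst written so far. */
        for (b = 0; b < unique; ++b) {
            if (dst[b] == src[a])
                break;          /* src[a] was already recorded */
        }

        /* The search reached the end without a match: src[a] is new. */
        if (b == unique)
            dst[unique++] = src[a];
    }
    return unique;
}
```

For the 12 characters of "another goes" (without the trailing newline), this sketch returns 10 and fills dst with "another gs". Because it never reads an entry of dst before writing it, the output buffer does not need to be initialized first.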