
Unrecognized Characters Getting Added to Output


I was programming in CLion 2021.1.2 and got some unrecognized characters in the output of a program that uses the getchar() function and strings. The goal of the program is to copy its input, replace each run of one or more consecutive blanks (i.e. ' ') with a single blank, and then print the result. The output string contained some unrecognized characters in the form of diamond-boxed question marks, and I don't understand why. Below is my code and two sample input-output pairs for reference:

My Code:

#include <stdio.h>

int main() {

  int c, i = 0;
  char s[100], g; // i am restricting the length of the string to 100

  while ((c = getchar()) != EOF) {
      if (i == 0)
      {
          i++;
          g = (char) c;
          s[0] = (char) c;
          continue;
      }
      if ((c == ' ' ) && (g == ' ')) 
      {
          continue;
      }

      s[i] = (char) c;
      g = (char) c;
      i++;
  }
  printf("%s\n", s);
  return 0;
}

Input 1:

Hello, This is me.     Welcome
Hi      Hello hello
Just Kidding   This is me
123  456 789 111^D

Output 1:

Hello, This is me. Welcome
Hi Hello hello
Just Kidding This is me
��������������������������������������B

Input 2:

123 456   789 abc
\n \t 123 145 *&$&)$@
1234567805018308513
^D

Output 2:

123 456 789 abc
\n \t 123 145 *&$&)$@
1234567805018308513
����������������������������������������������:

The ^D in the input indicates my use of Ctrl+D for EOF to be read by getchar().
As expected, the extra blanks have been removed from the input while returning the output, but these unrecognized characters also get printed, which confuses me.
The number of these unrecognized characters seems to change from run to run, and the last character (the one right after all the diamond-boxed question marks) is a recognized character, but it is also unwanted.

I have a few questions regarding this:

  1. Why is this happening? Could it have anything to do with the length restriction of 100 that I placed on the string?
  2. Is this something to do with the IDE, or my algorithm?
  3. How exactly does the copying of values take place? Are additional characters added while the string is being built up?
  4. Can this problem be rectified with the help of another method or function?

Thank you, any help regarding this would be appreciated.


Solution

  • As @kaylum pointed out, you absolutely need to terminate your string before printing it: printf's %s keeps reading until it finds a '\0' byte, and your uninitialized array is not guaranteed to contain one right after your data. As good practice, you might also want to give your variables meaningful names. Also, the continue statements are not needed when an else works equally well. In addition, since you have a limited-length string, it's good practice to do a bounds check. Perhaps you want something like this:

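    #include <stdio.h>

    int main()
    {
        int chr, idx = 0;
        char str[100], last_chr = '\0';  /* any non-space start value works */

        /* Stop at index 99 so there is always room for the terminator. */
        while ((chr = getchar()) != EOF  &&  idx < 99) {
            /* Skip a space only when the previously stored character was also a space. */
            if (last_chr != ' '  ||  chr != ' ')
                last_chr = str[idx++] = chr;
        }
        str[idx] = '\0';  /* terminate the string before printing it with %s */
        printf("%s\n", str);
        return 0;
    }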
    

    Note that initializing last_chr (your old g) to a non-space value eliminates the need for another test in your loop.
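
To make the termination and the bounds check easy to try in isolation, the same logic can be sketched as a helper that works on an in-memory string instead of stdin. The name collapse_blanks and its signature are assumptions made for illustration, not part of the code above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out`, writing each run of spaces as a
 * single space. At most cap - 1 characters are stored, and `out` is always
 * '\0'-terminated. Returns the number of characters stored (excluding '\0'). */
size_t collapse_blanks(const char *in, char *out, size_t cap)
{
    size_t n = 0;
    char last = '\0';  /* non-space start value, so no first-character special case */

    assert(in != NULL && out != NULL && cap > 0);

    for (; *in != '\0' && n + 1 < cap; in++) {
        if (*in != ' ' || last != ' ')
            last = out[n++] = *in;
    }
    out[n] = '\0';  /* without this, printing `out` with %s would read past the data */
    return n;
}
```

With a char buf[100], calling collapse_blanks("Hi      Hello hello", buf, sizeof buf) leaves "Hi Hello hello" in buf and returns 14.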

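    BTW, the diamond-question-mark character is the Unicode replacement character (U+FFFD). A terminal typically displays it for bytes it cannot decode as valid text, which here are the uninitialized bytes that printf read past the end of your data because the string was never terminated.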
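
On question 4: if the text only needs to be printed, one possible alternative is to skip the buffer entirely and write each character as soon as it is read, so there is no 100-character limit and no string to terminate. This is a minimal sketch, and squeeze_blanks is a hypothetical name chosen for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy `in` to `out`, writing each run of spaces as a
 * single space. Nothing is buffered, so no terminator is needed. */
void squeeze_blanks(FILE *in, FILE *out)
{
    int chr;
    int last_chr = EOF;  /* any non-space value works as the initial state */

    assert(in != NULL && out != NULL);

    while ((chr = getc(in)) != EOF) {
        if (chr != ' ' || last_chr != ' ')
            putc(chr, out);
        last_chr = chr;
    }
}
```

From main, this would be called as squeeze_blanks(stdin, stdout).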