Background:
I'm trying to create a program that takes a user name(assuming that input is clean), and prints out the initials of the name.
Objective:
Code:
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
string prompt(void);
char *getInitials(string input);
char *appendArray(char *output,char c,int count);
//Tracks # of initials
int counter = 0;
int main(void){
string input = prompt();
char *output = getInitials(input);
for(int i = 0; i < counter ; i++){
printf("%c",toupper(output[i]));
}
}
string prompt(void){
string input;
do{
printf("Please enter your name: ");
input = get_string();
}while(input == NULL);
return input;
}
char *getInitials(string input){
bool initials = true;
char *output;
output = malloc(sizeof(char) * counter);
for(int i = 0, n = strlen(input); i < n ; i++){
//32 -> ASCII code for spacebar
//9 -> ASCII code for tab
if(input[i] == 32 || input[i] == 9 ){
//Next char after spaces/tab will be initial
initials = true;
}else{//Not space/tab
if(initials == true){
counter++;
output = appendArray(output,input[i],counter);
initials = false;
}
}
// eprintf("Input[i] is : %c\n",input[i]);
// eprintf("Counter is : %i\n",counter);
// eprintf("i is : %i\n",i);
// eprintf("n is : %i\n",n);
}
return output;
}
char *appendArray(char *output,char c,int count){
// allocate an array of some initial (fairly small) size;
// read into this array, keeping track of how many elements you've read;
// once the array is full, reallocate it, doubling the size and preserving (i.e. copying) the contents;
// repeat until done.
//pointer to memory
char *data = malloc(0);
//Increase array size by 1
data = realloc(output,sizeof(char) * count);
//append the latest initial
strcat(data,&c);
printf("Value of c is :%c\n",c);
printf("Value of &c is :%s\n",&c);
for(int i = 0; i< count ; i++){
printf("Output: %c\n",data[i]);
}
return data;
}
Problem:
The output is not what i expected as there is a mysterious P appearing in the output.
E.g When i enter the name Barack Obama, instead of getting the result:BO, i get the result BP and the same happens for whatever name i choose to enter, with the last initial always being P.
Output:
Please enter your name: Barack Obama
Value of c is :B
Value of &c is :BP
Output: B
Value of c is :O
Value of &c is :OP
Output: B
Output: P
BP
What i've done:
I've traced the problem to the appendArray function, and more specifically to the value of &c (Address of c) though i have no idea what's causing the P to appear,what it means, why it appears and how i can get rid of it.
The value of P shows up no matter when i input.
Insights as to why it's happening and what i can do to solve it will be much appreciated.
Thanks!
Several issues, in decreasing order of importance...
First issue - c
in appendArray
is not a string - it is not a sequence of character values terminated by a 0. c
is a single char
object, storing a single char
value.
When you try to print c
as a string, as in
printf("Value of &c is :%s\n",&c);
printf
writes out the sequence of character values starting at the address of c
until it sees a 0-valued byte. For whatever reason, the byte immediately following c
contains the value 80, which is the ASCII (or UTF-8) code for the character 'P'
. The next byte contains a 0 (or there's a sequence of bytes containing non-printable characters, followed by a 0-valued byte).
Similarly, using &c
as the argument to strcat
is inappropriate, since c
is not a string. Instead, you should do something like
data[count-1] = c;
Secondly, if you want to treat the data
array as a string, you must make sure to size it at least 1 more than the number of initials and write a 0 to the final element:
data[count-1] = 0; // after all initials have been stored to data
Third,
char *data = malloc(0);
serves no purpose, the behavior is implementation-defined, and you immediately overwrite the result of malloc(0)
with a call to realloc
:
data = realloc(output,sizeof(char) * count);
So, get rid of the malloc(0)
call altogether; either just initialize data
to NULL
, or initialize it with the realloc
call:
char *data = realloc( output, sizeof(char) * count );
Fourth, avoid using "magic numbers" - numeric constants with meaning beyond their immediate, literal value. When you want to compare against character values, use character constants. IOW, change
if(input[i] == 32 || input[i] == 9 ){
to
if ( input[i] == ' ' || input[i] == '\t' )
That way you don't have to worry about whether the character encoding is ASCII, UTF-8, EBCDIC, or some other system. ' '
means space everywhere, '\t'
means tab everywhere.
Finally...
I know part of your motivation for this exercise is to get familiar with malloc
and realloc
, but I want to caution you about some things:
realloc
is potentially an expensive operation, it may move data to a new location, and it may fail. You really don't want to realloc
a buffer a byte at a time. Instead, it's better to realloc
in chunks. A typical strategy is to multiply the current buffer size by some factor > 1 (typically doubling):
char *tmp = realloc( data, current_size * 2 );
if ( tmp )
{
current_size *= 2;
data = tmp;
}
You should always check the result of a malloc
, calloc
, or realloc
call to make sure it succeeded before attempting to access that memory.
Minor stylistic notes:
Avoid global variables where you can. There's no reason counter
should be global, especially since you pass it as an argument to appendArray
. Declare it local to main
and pass it as an argument (by reference) to getInput
:
int main( void )
{
int counter = 0;
...
char *output = getInitials( input, &counter );
for(int i = 0; i < counter ; i++)
{
printf("%c",toupper(output[i]));
}
...
}
/**
* The "string" typedef is an abomination that *will* lead you astray,
* and I want to have words with whoever created the CS50 header.
*
* They're trying to abstract away the concept of a "string" in C, but
* they've done it in such a way that the abstraction is "leaky" -
* in order to use and access the input object correctly, you *need to know*
* the representation behind the typedef, which in this case is `char *`.
*
* Secondly, not every `char *` object points to the beginning of a
* *string*.
*
* Hiding pointer types behind typedefs is almost always bad juju.
*/
char *getInitials( const char *input, int *counter )
{
...
(*counter)++; // parens are necessary here
output = appendArray(output,input[i],*counter); // need leading * here
...
}