I'm finally finishing K&R, but I've run into another unclear piece of code, in chapters 6.3/6.4.
Regarding getword: how can it return an int when it is supposed to return a word? I understand that it returns word[0], which is the first letter. However, in my opinion, if I wanted to return a word, I'd declare something like char *getword. Am I right?
How can an int indicate that it is a word?
Still about getword: suppose I enter "in " and press Enter after the space. getword reads 'i'; it is not a space and it is alphabetic, so the if (!isalpha(c)) block is skipped. What happens then?
I marked the line in binsearch with [3]. Don't you think it should be high = mid - 1; there?
    int getword(char *word, int lim) {
        char *w = word;
        int c;
        while (isspace(c = getch()))
            {}
        if (c != EOF) {
            *w++ = c;
        }
        if (!isalpha(c)) {
            *w = '\0';
            return c;
        }
        for ( ; --lim > 0; w++) {
            if (!isalnum(*w = getch())) {
                ungetch(*w);
                break;
            }
        }
        *w = '\0';
        return word[0];
    }
    /* binsearch: find word in tab[0]...tab[n-1] */
    struct key *binsearch(char *word, struct key *tab, int n)
    {
        int cond;
        struct key *low = &tab[0];
        struct key *high = &tab[n];
        struct key *mid;
        while (low < high) {
            mid = low + (high-low) / 2;
            if ((cond = strcmp(word, mid->word)) < 0)
                high = mid; /* [3] */
            else if (cond > 0)
                low = mid + 1;
            else
                return mid;
        }
        return NULL;
    }
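You are correct: if the function returned the word itself, it would be declared as char *getword(). However, according to K&R:
"The function value is the first character of the word, or EOF for end of file, or the character itself if it is not alphabetic."
Returning an int is fine because in C a char is just a small integer type (typically 8 bits, with the range [-128, +127] when char is signed), so any character fits in an int. The wider type is also what lets the function return EOF, a negative int that is distinct from every character value.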
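To illustrate how a caller can tell the three cases apart from a single int, here is a minimal sketch. The enum token_kind and the function classify are hypothetical names made up for this illustration; they do not appear in K&R.

```c
#include <ctype.h>
#include <stdio.h>   /* EOF */

/* Hypothetical names, not from K&R: the three things getword's int can mean. */
enum token_kind { TOK_EOF, TOK_WORD, TOK_OTHER };

/* Interpret a value returned by a getword-style function. */
enum token_kind classify(int r)
{
    if (r == EOF)        /* EOF is a negative int, so it cannot be a character read by getchar */
        return TOK_EOF;
    if (isalpha(r))      /* first letter of a word: the word itself is in the buffer */
        return TOK_WORD;
    return TOK_OTHER;    /* a single non-alphabetic character such as '+' or '1' */
}
```

This is why K&R's main loop can write something like while (getword(word, MAXWORD) != EOF) and then test isalpha(word[0]).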
So where is the word returned?
In the char *word buffer passed as a parameter. Initially, char *w gets a copy of the word pointer, and the characters read are then stored in the memory that w points to.
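The same "fill the caller's buffer through a moving pointer" pattern can be sketched in isolation. The function fill_word below is a hypothetical helper invented for this example (it copies from a string instead of reading input), and it assumes lim is at least 1.

```c
/* Hypothetical helper, not from K&R: copy src into the caller's buffer
   and return its first character, the way getword reports a word. */
int fill_word(char *word, const char *src, int lim)
{
    char *w = word;                 /* w starts at the same address as word */

    while (--lim > 0 && *src != '\0')
        *w++ = *src++;              /* write through w, then advance w */
    *w = '\0';                      /* terminate the string */
    return word[0];                 /* word itself never moved */
}
```

After a call such as fill_word(buf, "in", 8), the caller's buf holds the whole word, even though the function only returned one character.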
With "in " in the input buffer, isspace returns false for 'i', so the while loop stops and c holds that first non-space character. Then *w++ = c stores the character at position [0] of word and increments the w pointer (++), so word[0] contains 'i'.
The !isalpha test is false, so that block is skipped.
Then characters are read from the input and stored at successive w positions until a non-alphanumeric character is read (or the limit lim is reached). In the non-alphanumeric case, the character just read is pushed back onto the input with ungetch, and w, which points at that unwanted character, is not incremented (because of the break). The following *w = '\0' then overwrites the unwanted character and terminates the C string (C strings end with a character whose value is 0).
In your example, the loop stores 'n' in *w, increments w, then stores ' ' in *w and runs the !isalnum branch, i.e. pushes the space back and breaks out of the loop. Since w was not incremented after storing ' ', *w = '\0' replaces the space and terminates the string, leaving "in" in word.
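To check this trace by running it, here is a small sketch that feeds the string "in \n" to getword. The getch and ungetch below are simplified stand-ins written for this example (they read from a fixed string and keep a single pushed-back character); they are assumptions, not K&R's buffered versions.

```c
#include <ctype.h>
#include <stdio.h>   /* EOF */

/* Assumption: the input is a fixed string instead of stdin. */
static const char *input = "in \n";
static int pushed = EOF;            /* one-character pushback slot */

static int getch(void)
{
    if (pushed != EOF) {            /* hand back the pushed-back character first */
        int c = pushed;
        pushed = EOF;
        return c;
    }
    return *input != '\0' ? (unsigned char)*input++ : EOF;
}

static void ungetch(int c)
{
    pushed = c;                     /* remember it for the next getch */
}

/* getword as in the question */
int getword(char *word, int lim)
{
    char *w = word;
    int c;

    while (isspace(c = getch()))
        ;
    if (c != EOF)
        *w++ = c;                   /* word[0] = 'i' */
    if (!isalpha(c)) {
        *w = '\0';
        return c;
    }
    for ( ; --lim > 0; w++) {
        if (!isalnum(*w = getch())) {
            ungetch(*w);            /* the ' ' goes back to the input */
            break;                  /* w is not incremented here */
        }
    }
    *w = '\0';                      /* overwrites the ' ' */
    return word[0];
}
```

With this input, a first call getword(word, 16) should return 'i' and leave "in" in word. A second call skips the pushed-back space and the newline, then returns EOF.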
[the other half of the question has already been answered by someone else]