I don't understand how works this sort.Why is it so? And how we compare characters in C , how can I understand which of them less or greater than other? It's from book.
Call the function compareStrings and have it return the value –1 if the first string is lexicographically less than the secondstring, 0 if the two strings are equal, and 1 if the first string is lexicographically greater than the second string. So, the function call
compareStrings ("alpha", "altered")
returns the value –1 because the first string is lexicographically less than the second string (think of this to mean that the first string occurs before the second string in a dictionary).And, the function call
compareStrings ("zioty", "yucca");
returns the value 1 because “zioty” is lexicographically greater than “yucca.”
#include <stdio.h>
#include <stdbool.h>
struct entry
{
char word[15];
char definition[50];
};
bool equalStrings(char s1[], char s2[])
{
bool areEqual = false;
int i = 0;
while(s1[i] != '\0' && s2[i] != '\0' && s1[i] == s2[i])
i++;
if(s1[i] == '\0' && s2[i] == '\0')
areEqual = true;
else
areEqual = false;
return areEqual;
}
int lookup (struct entry dictionary[], char search[],int entries)
{
int low = 0;
int high = entries - 1;
int mid, result;
while (low <= high)
{
mid = (low + high) / 2;
result = compareStrings (dictionary[mid].word, search);
if(result == -1)
low = mid + 1;
else if(result == 1)
high = mid - 1;
else
return mid;
}
return -1;
}
int compareStrings(char s1[], char s2[])
{
int i = 0, answer;
while(s1[i] == s2[i] && s1[i] != '\0' && s2[i] != '\0')
i++;
if(s1[i] < s2[i])
answer = -1;
else if(s1[i] == s2[i])
answer = 0;
else
answer = 1;
return answer;
}
int main()
{
struct entry dictionary[100] =
{
{"aardvark", "a burrowing African mammal" },
{"acumen", " a bottomless pit "},
{"addle", "to become confused "},
{"agar", "a jelly made from seaweed" }
{"affix", "to append; attach"}
};
char word[15];
int entries = 10;
int entry;
printf("Enter word: ");
scanf("%14s", word);
entry = lookup (dictionary, word, entries);
if(entry != -1)
printf("%s\n", dictionary[entry].definition);
else
printf("Sorry, the word %s is not in my dictionary.\n", word);
return 0;
}
How understand that one greater than another? Thanks.
Lexicographical order is like dictionary order, a
comes before b
... with the additional behavior that case matters, uppercase coming before lowercase, and non alphabetic characters also compare based on their encoding.
This comparison is performed character by character. If all characters in both strings are equal, the strings compare equal, the return value is 0
, otherwise the relative order is determined by the comparison of the first mismatched character. If s1[i] < s2[i]
, string s1
comes before s2
, the return value is negative, otherwise s1
comes after s2
, the return value is positive.
Note however that this book implementation is sub-optimal and potentially incorrect:
s1[i] == s2[i]
and s1[i] != '\0'
, comparing s2[i]
to '\0'
is redundant.unsigned char
so extended characters come after regular characters. strcmp
is specified this way.Here is a simplified version:
int compareStrings(const char *s1, const char *s2) {
size_t i;
for (i = 0; s1[i] == s2[i]; i++) {
if (s1[i] == '\0')
return 0;
}
if ((unsigned char)s1[i] < (unsigned char)s2[i])
return -1;
else
return 1;
}
Further notes:
the dictionary
defined in main
is not sorted, agar
should come after affix
. The lookup
function relies on the array of entry
structures being sorted. It might not find the correct entry for some input words.
the lookup
function uses a binary search algorithm. This method is very efficient, but the implementation has a classic bug: mid = (low + high) / 2;
can overflow for large arrays, causing undefined behavior. It should be written:
mid = low + (high - low) / 2;