I am trying to do this exercise in C programming: "Write a program that takes a string as input and counts the frequency of each character."
I have written this function:
void printCharactersFrequenciesOf(char s[]){
    size_t stringlength = strlen(s); // length of string
    char chars[stringlength]; // variable for the different characters in the string
    int charsFrequencies[stringlength], charAlreadyExists, differentCharsNumber = 0; // variables for the frequency of each character in the string, a flag to know whether the character already exists in the characters array, and for the different characters number in the string
    // putting the different characters of the string in the different characters array
    for (int i = 0; i < stringlength; i++){
        charAlreadyExists = 0;
        for (int j = 0; j < i; j++){
            if (s[i] == chars[j]){
                charAlreadyExists = 1;
                j = i; // break loop
            }
        }
        if (charAlreadyExists == 0){
            chars[differentCharsNumber] = s[i];
            differentCharsNumber++;
        }
    }
    chars[differentCharsNumber] = charsFrequencies[differentCharsNumber] = '\0'; // terminating the different characters array and the characters frequencies array with a null terminator if they're shorter than the length of the string
    int charCount; // a counter variable for the number of appearance of each existing character
    // getting character frequencies into the character frequencies array
    for (int i = 0; i < differentCharsNumber; i++){
        charCount = 0;
        for (int j = 0; j < stringlength; j++){
            if (chars[i] == s[j]){
                charCount++;
            }
        }
        charsFrequencies[i] += charCount;
    }
    // printing the frequencies of the different characters
    for (int i = 0; i < differentCharsNumber; i++){
        printf("Frequency of '%c': %d\n", chars[i], charsFrequencies[i]);
    }
}
In this function I first collect the distinct characters of the source string in an array. Then I go through that array and, for each character, scan the string to count how often it occurs.
But unfortunately this code doesn't work. It does find and print the distinct characters of the string, but the frequencies are garbage.
For example, for the string "Temme" I get:
Frequency of 'T': -1920988639
Frequency of 'e': -23
Frequency of 'm': -606004806
But I expect to get:
Frequency of 'T': 1
Frequency of 'e': 2
Frequency of 'm': 2
However, for the string "bb" I get:
Frequency of 'b': 2
As expected.
I'd like to know what I did wrong, even if this solution is not ideal.
Thanks in advance.
You are overcomplicating a very simple function, and the roundabout algorithm makes your code hard to read. Simply use an array long enough to hold a count for every character you care about.
In this example code, I count the characters with codes 32 to 127. You can change the range to include, for example, control characters.
#include <stdio.h>
#include <string.h>

#define MAX_ASCII 127
#define MIN_ASCII 32

/* Counts the characters of str that fall in [MIN_ASCII, MAX_ASCII] into arr,
   which must have MAX_ASCII - MIN_ASCII + 1 elements. Returns the string length. */
size_t count(const char *str, size_t *arr)
{
    size_t len = 0;

    if(str && arr)
    {
        memset(arr, 0, (MAX_ASCII - MIN_ASCII + 1) * sizeof(*arr));
        while(*str)
        {
            unsigned char c = (unsigned char)*str; /* avoid negative values where char is signed */
            if(c >= MIN_ASCII && c <= MAX_ASCII)
            {
                arr[c - MIN_ASCII] += 1;
            }
            str++;
            len++;
        }
    }
    return len;
}

int main(void)
{
    size_t freq[MAX_ASCII - MIN_ASCII + 1];
    const char *str = "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.";
    size_t len = count(str, freq);

    printf("Total length of the string is: %zu\n", len);
    for(int i = 0; i <= MAX_ASCII - MIN_ASCII; i++)
    {
        /* print only the characters that actually occur */
        if(freq[i])
            printf("Char %03d ('%c') was found %4zu times (% 6.2f%%)\n", i + MIN_ASCII,
                   i + MIN_ASCII, freq[i], (100.0 * freq[i]) / len);
    }
}
https://godbolt.org/z/53corfj19
Result:
Total length of the string is: 574
Char 032 (' ') was found 90 times ( 15.68%)
Char 039 (''') was found 1 times ( 0.17%)
Char 044 (',') was found 4 times ( 0.70%)
Char 046 ('.') was found 4 times ( 0.70%)
Char 048 ('0') was found 3 times ( 0.52%)
Char 049 ('1') was found 2 times ( 0.35%)
Char 053 ('5') was found 1 times ( 0.17%)
Char 054 ('6') was found 1 times ( 0.17%)
Char 057 ('9') was found 1 times ( 0.17%)
Char 065 ('A') was found 1 times ( 0.17%)
Char 073 ('I') was found 6 times ( 1.05%)
Char 076 ('L') was found 5 times ( 0.87%)
Char 077 ('M') was found 1 times ( 0.17%)
Char 080 ('P') was found 1 times ( 0.17%)
Char 097 ('a') was found 28 times ( 4.88%)
Char 098 ('b') was found 5 times ( 0.87%)
Char 099 ('c') was found 10 times ( 1.74%)
Char 100 ('d') was found 16 times ( 2.79%)
Char 101 ('e') was found 59 times ( 10.28%)
Char 102 ('f') was found 6 times ( 1.05%)
Char 103 ('g') was found 11 times ( 1.92%)
Char 104 ('h') was found 14 times ( 2.44%)
Char 105 ('i') was found 32 times ( 5.57%)
Char 107 ('k') was found 7 times ( 1.22%)
Char 108 ('l') was found 17 times ( 2.96%)
Char 109 ('m') was found 18 times ( 3.14%)
Char 110 ('n') was found 38 times ( 6.62%)
Char 111 ('o') was found 25 times ( 4.36%)
Char 112 ('p') was found 18 times ( 3.14%)
Char 114 ('r') was found 24 times ( 4.18%)
Char 115 ('s') was found 39 times ( 6.79%)
Char 116 ('t') was found 43 times ( 7.49%)
Char 117 ('u') was found 17 times ( 2.96%)
Char 118 ('v') was found 5 times ( 0.87%)
Char 119 ('w') was found 6 times ( 1.05%)
Char 120 ('x') was found 2 times ( 0.35%)
Char 121 ('y') was found 13 times ( 2.26%)
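As for what went wrong in the code from the question: only `charsFrequencies[differentCharsNumber]` is ever set to zero, so every `charsFrequencies[i] += charCount` adds to an uninitialized value. That "bb" printed 2 was luck, because the memory happened to hold 0. Two further problems are that the duplicate search runs over `j < i` rather than over the entries of `chars` written so far, so it can read uninitialized elements, and that the terminating write to index `differentCharsNumber` is out of bounds when every character is distinct. Below is a minimal sketch that keeps the idea of an array of distinct characters with a parallel array of counts, but merges the two passes into one and starts each count at 1. The helper name `countDistinctChars` is hypothetical and not part of the original code.

```c
#include <stdio.h>
#include <string.h>

/* Collects the distinct characters of s into chars[] and their counts into
   freqs[]; both arrays need room for strlen(s) entries. Returns the number
   of distinct characters. (Hypothetical helper, not from the question.) */
size_t countDistinctChars(const char s[], char chars[], int freqs[])
{
    size_t stringLength = strlen(s);
    size_t distinct = 0;

    for (size_t i = 0; i < stringLength; i++) {
        size_t j = 0;
        /* Search only the entries written so far, not the first i slots. */
        while (j < distinct && chars[j] != s[i]) {
            j++;
        }
        if (j == distinct) {
            /* First occurrence: store the character and start its count at 1. */
            chars[distinct] = s[i];
            freqs[distinct] = 1;
            distinct++;
        } else {
            /* Seen before: bump the count that was initialized earlier. */
            freqs[j]++;
        }
    }
    return distinct;
}

void printCharactersFrequenciesOf(const char s[])
{
    size_t stringLength = strlen(s);
    if (stringLength == 0) {
        return; /* nothing to count, and a zero-length VLA is not allowed */
    }
    char chars[stringLength];
    int freqs[stringLength];
    size_t distinct = countDistinctChars(s, chars, freqs);

    for (size_t i = 0; i < distinct; i++) {
        printf("Frequency of '%c': %d\n", chars[i], freqs[i]);
    }
}
```

Calling `printCharactersFrequenciesOf("Temme")` with this version prints 1 for 'T', 2 for 'e' and 2 for 'm'. It is still quadratic in the string length, so the fixed-size table shown earlier remains the simpler choice.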