
C programming: Problems reading a text file and finding the longest word


I am a beginner at coding, so I might be making quite a few rookie mistakes here and there. We got this task in school, and the goal is to find the longest word and print it together with the number of characters it has. I have gotten this far, but from here on I have a hard time finding where the problem lies. The program gets stuck on iterations 49, 50, 51 and 59 most of the time. I think it is because realloc returns NULL for the longestWord variable.

Any ideas on where to look to fix these issues? Thanks in advance, guys!

Input:

abc
abcde
abcdefghij
abcdefghij
abcdefghijklmnopq
abcdefghijklmnopq
abcdefghijklmnop
auf wiedersehen

Expected output:

17 characters in longest word: abcdefghijklmnopq

My code so far:

#include <stdio.h>
#include <stdlib.h>

//_________//

FILE* fptr;
int c;
int iteration=0;     //Just to keep track 



//___________Main____________//

int main()
{

    fptr = fopen("C:\\....\\input", "r");

    char *s;
    char *longestWord;
    int i=1, charCount=0;

    s = (char*) malloc (sizeof(char));
    longestWord = (char*) malloc (sizeof(char));

    while((c=fgetc(fptr)) != EOF){
        iteration++;
        printf("Iteration %d\n",iteration);
        if (isalpha(c)!=0 ){

            s=realloc(s,i*sizeof(char));
            s[i]=c;
            i++;
        }

        else if(c==' ' || c =='\n'){
            if(i>charCount){
                charCount=i-1;
                longestWord=realloc(longestWord,i*sizeof(char));


                while(longestWord == NULL){
                         longestWord=realloc(longestWord,i*sizeof(char));
                }
                for(int t=0;t<i;t++){
                        *(longestWord+t)=*(s+t);

                }

                i=1;

            }

            else{
               printf("*********Checkpoint 3***************\n");                //Checkpoint 3
               i=1;

            }

        }

        else{
            printf("\n\n********Error, got to the else section of the program********\n\n");
        }

    }

    printf("%d characters in the longest word: %s\n",charCount, longestWord);

    free(s);
    free(longestWord);
    fclose(fptr);
    return 0;

} //_____________END OF MAIN____________ //


Solution

  • Here is an updated version of your code that does what you requested.

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <ctype.h>
    
    int main()
    {
        FILE* fptr;
        int c;
        char *s;
        char *longestWord;
        int i=0;
    
        fptr = fopen("input.txt", "r"); // TODO: validate if fptr is not NULL
    
        // TODO: validate the return of mallocs
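        s = (char*) malloc (sizeof(char)); // allocates 1 element
        longestWord = (char*) malloc(sizeof(char));
        s[0] = '\0';            // start both buffers as empty strings,
        longestWord[0] = '\0';  // otherwise strlen reads uninitialized memory
    
        while((c=fgetc(fptr)) != EOF){
            if (isalpha(c) != 0 ){
                s=realloc(s, i+2); // room for i+1 letters plus the terminator
                s[i]=c;
                i++;
                s[i]='\0';         // keep s a valid string at all times
            }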
            else if(c==' ' || c =='\n'){            
                s[i] = '\0';
    
                // check if it is the longest
                if(strlen(s) > strlen(longestWord)) {
                    longestWord = realloc(longestWord, strlen(s)+1);
                    strcpy(longestWord, s);
                }
    
                memset(s, '\0', strlen(s)+1);
                i=0;
            }
            else{
                printf("Weird character %c\n", c);
            }
        }
    
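        // a final word that is not followed by a space or newline is still pending
        s[i] = '\0';
        if(strlen(s) > strlen(longestWord)) {
            longestWord = realloc(longestWord, strlen(s)+1);
            strcpy(longestWord, s);
        }
    
        // strlen returns size_t, which is printed with %zu
        printf("%zu characters in the longest word: %s\n", strlen(longestWord), longestWord);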
    
        free(s);
        free(longestWord);
        fclose(fptr);
        return 0;
    }
    

    Please note the following:

    • Validations of the return values of functions such as fopen, malloc, ..., are missing;
    • Global variables do not make sense in this particular program and therefore were moved inside the main function;
    • Characters that are not part of [a-zA-Z], other than space and newline, are reported and skipped; they do not end the current word.
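
As a rough illustration of the missing validations mentioned in the notes, the same idea could be sketched with every allocation checked. The helper name `longest_word` and the initial capacity of 16 are assumptions made for this sketch and are not part of the answer's code. Unlike the program above, this version treats any non-letter (not only space and newline) as the end of a word, and it also counts a final word that is not followed by a newline.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the answer above): returns a newly
 * allocated copy of the longest run of letters read from `in`, or NULL
 * if an allocation fails. The caller must free the result. */
char *longest_word(FILE *in)
{
    size_t cap = 16, len = 0, best_len = 0;
    char *word = malloc(cap);   /* letters of the word being read */
    char *best = malloc(1);     /* longest word seen so far */
    int c;

    if (word == NULL || best == NULL) {
        free(word);             /* free(NULL) is a no-op */
        free(best);
        return NULL;
    }
    best[0] = '\0';

    for (;;) {
        c = fgetc(in);
        if (c != EOF && isalpha(c)) {
            if (len + 1 >= cap) {
                /* Grow through a temporary pointer so the old block
                 * is not leaked when realloc fails. */
                char *tmp = realloc(word, cap * 2);
                if (tmp == NULL) {
                    free(word);
                    free(best);
                    return NULL;
                }
                word = tmp;
                cap *= 2;
            }
            word[len++] = (char)c;
            continue;
        }
        /* Any non-letter, or EOF, ends the current word. */
        if (len > best_len) {
            char *tmp = realloc(best, len + 1);
            if (tmp == NULL) {
                free(word);
                free(best);
                return NULL;
            }
            best = tmp;
            memcpy(best, word, len);
            best[len] = '\0';
            best_len = len;
        }
        len = 0;
        if (c == EOF)
            break;
    }
    free(word);
    return best;
}
```

A caller would open the file with fopen, check the returned pointer for NULL before passing it in, print the result with `%zu` for `strlen(result)`, and then free the result and close the file. Assigning the return value of realloc to a temporary first is the key difference from the programs above: if realloc fails, the original pointer is still valid and can be freed.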