Scanning in a specific word from a file


I have a file called pageRankList that contains, on each line, a URL, its number of outgoing links, and its page rank, in that order. Given a URL, how can I get its page rank with fscanf or other functions?

url23 4 0.0405449
url31 3 0.0371111
url22 5 0.0300785
url34 4 0.0288782
url21 2 0.0247087
url11 3 0.0235192
url32 2 0.0227647

This is what I have so far, but when I run it I get a "SEGV on unknown address" error and I can't figure out why:

static double getPageRank(char *url) {
    double pageRank = 0;
    FILE *fp = fopen("pageRankList.txt", "r");
    char str[1000];

    int counter = 0;
    while (fscanf(fp, " %98s", str) != EOF) {
        if (strcmp(url, str) == 0) {
            counter++;
            continue;
        }

        if (counter == 2) {
            pageRank = atof(str);
            printf("%f\n", pageRank);
            break;
        }
    }
    fclose(fp);
    return pageRank;
}

Solution

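  • fscanf(fp, " %98s", str)

    %s reads a single whitespace-delimited token, so each call returns only one field of a line, not the whole record. As a result, counter is incremented only once (when the url matches), so it never reaches 2 and the rank is never read. Also check the return value of fopen: if the file cannot be opened, fp is NULL, and passing it to fscanf is a likely cause of the SEGV. It is simpler to read all three fields (url, number of outgoing links, page rank) at once. I would read a whole line at a time and then use sscanf on that: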

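    #include <stdio.h>
    #include <string.h>

    // Returns the page rank for url, or -1 if the file cannot be opened
    // or the url is not listed.
    static double getPageRank(const char* url)
    {
        FILE* fp = fopen("pageRankList.txt", "r");
        if (!fp) return -1;
    
        char str[1000];
        double pageRank = -1;
    
        while (fgets(str, sizeof(str), fp)) { // Read one line
            int number;
            double rank;
            char line_url[100];
            // Parse url, outgoing-link count and rank; skip malformed lines
            if (sscanf(str, "%99s %d %lf", line_url, &number, &rank) == 3) {
                if (strcmp(url, line_url) == 0) {
                    pageRank = rank; // Only store the rank on a match
                    break;
                }
            }
        }
        fclose(fp);
        return pageRank;
    }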
    

    * Note: this works as long as the URL contains no spaces and is shorter than 100 characters.
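
Since the question also asks about fscanf, here is a minimal sketch of the same lookup that lets fscanf consume one whole record (url, link count, rank) per call. The helper name lookupPageRank, its path parameter, and the found/not-found return convention are assumptions made for illustration; they are not part of the original code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): looks up `url` in the
 * file at `path` and stores its rank in *rank.
 * Returns 1 if the url was found, 0 otherwise (including open failure). */
static int lookupPageRank(const char *path, const char *url, double *rank)
{
    FILE *fp = fopen(path, "r");
    if (!fp) return 0;              /* never pass a NULL stream to fscanf */

    char line_url[100];
    int links;
    double value;
    int found = 0;

    /* Each successful call reads one complete record: url, links, rank. */
    while (fscanf(fp, "%99s %d %lf", line_url, &links, &value) == 3) {
        if (strcmp(url, line_url) == 0) {
            *rank = value;          /* write the result only on a match */
            found = 1;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

A call would look like `double r; if (lookupPageRank("pageRankList.txt", "url31", &r)) printf("%f\n", r);`. One trade-off compared with the fgets/sscanf version: the loop stops at the first record that does not parse, so a single malformed line hides everything after it, whereas the line-based version simply skips that line.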