
How to group rows with the same column value into one row in C?


I am trying to write statistics code for a file in which column 1 holds the year, column 2 the type, and column 3 a count. I need the sum of the counts for each type in each year.

I am stuck on grouping rows that share the same column value into a single row.

Input data:

2010    101     22
2010    101     40
2010    101     44
2010    101     66
2010    102     14
2010    100     7
2010    101     2
2010    101     3
2010    101     2
2010    101     3
2011    101     23
2011    101     27
2011    101     47
2011    101     66
2011    100     5
2011    102     16
2011    101     4
2011    101     1
2011    101     3
2011    101     5

Output:

| Year | 100 | 101 | 102 |
--------------------------
| 2010 |  7  | 182 |  14 |
| 2011 |  5  | 176 |  16 |

I could do

if(year == 2010)
{}
if(year == 2011)
{}

but my data will not always look like the given input. Is there a way to group the rows without knowing in advance how many rows there are or which years appear in the year column? Maybe by comparing row by row?

I'm confused, please help..


Solution

  • With a fixed/limited range for years and types, we can use a fixed-size 2D array to hold the counts.

    Here's some code that does that:

    #include <stdio.h>
    #include <stdlib.h>
    
    // year config
    #define YRLO    2008
    #define YRHI    2020
    #define YRTOT   ((YRHI - YRLO) + 1)
    
    // type config
    #define TYPLO   100
    #define TYPHI   103
    #define TYPTOT  ((TYPHI - TYPLO) + 1)
    
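    // true if _val is in [_lo,_hi]; the outer parentheses make "! RANGE(...)"
    // negate the whole test instead of only the first comparison
    #define RANGE(_val,_lo,_hi) \
        (((_val) >= (_lo)) && ((_val) <= (_hi)))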
    
    int counts[YRTOT][TYPTOT];
    
    int
    main(void)
    {
    
        const char *file = "data.txt";
        FILE *xfin = fopen(file,"r");
    
        if (xfin == NULL) {
            perror(file);
            exit(1);
        }
    
        int yr;
        int typ;
        int cnt;
        int lno = 0;
    
        while (fscanf(xfin,"%d %d %d",&yr,&typ,&cnt) == 3) {
            ++lno;
    
            if (! RANGE(yr,YRLO,YRHI)) {
                printf("line %d -- bad year -- %d\n",lno,yr);
                continue;
            }
    
            if (! RANGE(typ,TYPLO,TYPHI)) {
                printf("line %d -- bad type -- %d\n",lno,typ);
                continue;
            }
    
            // store data (convert absolute years and types to relative numbers)
            counts[yr - YRLO][typ - TYPLO] += cnt;
        }
    
        fclose(xfin);
    
        const char *fmt = " | %8d";
    
        int totlen = 0;
    
        // print the title
        totlen += printf("| Year");
        for (int tidx = 0;  tidx < TYPTOT;  ++tidx)
            totlen += printf(fmt,tidx + TYPLO);
        totlen += printf(" |");
        printf("\n");
    
        // print a dashed line
        for (int icol = 0;  icol < totlen;  ++icol)
            fputc('-',stdout);
        printf("\n");
    
        for (int yidx = 0;  yidx < YRTOT;  ++yidx) {
            // A nicety: decide if year has any non-zero counts
            int hasdata = 0;
            for (int tidx = 0;  tidx < TYPTOT;  ++tidx) {
                if (counts[yidx][tidx]) {
                    hasdata = 1;
                    break;
                }
            }
    
            // skip any years that have no data [optional]
            if (! hasdata)
                continue;
    
            // output the absolute year
            printf("| %d",yidx + YRLO);
    
            // output the counts for each type
            for (int tidx = 0;  tidx < TYPTOT;  ++tidx)
                printf(fmt,counts[yidx][tidx]);
    
            printf(" |\n");
        }
    
        return 0;
    }
    

    Here's the program output for your given input data:

    | Year |      100 |      101 |      102 |      103 |
    ----------------------------------------------------
    | 2010 |        7 |      182 |       14 |        0 |
    | 2011 |        5 |      176 |       16 |        0 |
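
If the years and types are not known ahead of time, one possible variation is to discover them while reading: keep a list of the distinct years and a list of the distinct types, and look each value up (appending it when it is new) to get the row and column index. The following is only a minimal sketch of that idea, not part of the program above. The names `struct pivot`, `MAXKEYS`, `pivot_add`, `pivot_get`, `pivot_read` and `pivot_print` are made up for illustration, and the limit of 64 distinct years and 64 distinct types is an assumption. Rows and columns come out in first-seen order (for the sample input the type columns would be 101, 102, 100), so sort the keys first if ordered output matters.

```c
#include <assert.h>
#include <stdio.h>

#define MAXKEYS 64  /* assumed upper bound on distinct years and distinct types */

/* Pivot table whose row (year) and column (type) keys are discovered on the fly. */
struct pivot {
    int years[MAXKEYS];
    int types[MAXKEYS];
    int nyears;
    int ntypes;
    long sums[MAXKEYS][MAXKEYS];    /* sums[year index][type index] */
};

/* Return the index of key in keys[0..*n-1], appending it if absent.
   Returns -1 when the key is new and the array is already full. */
static int find_or_add(int *keys, int *n, int key)
{
    for (int i = 0; i < *n; ++i)
        if (keys[i] == key)
            return i;
    if (*n >= MAXKEYS)
        return -1;
    keys[*n] = key;
    return (*n)++;
}

/* Add cnt to the cell for (year, type). Returns 0 on success, -1 if a key list is full. */
int pivot_add(struct pivot *p, int year, int type, long cnt)
{
    assert(p != NULL);
    int yi = find_or_add(p->years, &p->nyears, year);
    int ti = find_or_add(p->types, &p->ntypes, type);
    if (yi < 0 || ti < 0)
        return -1;
    p->sums[yi][ti] += cnt;
    return 0;
}

/* Return the sum stored for (year, type), or 0 if that pair was never seen. */
long pivot_get(const struct pivot *p, int year, int type)
{
    assert(p != NULL);
    for (int yi = 0; yi < p->nyears; ++yi) {
        if (p->years[yi] != year)
            continue;
        for (int ti = 0; ti < p->ntypes; ++ti)
            if (p->types[ti] == type)
                return p->sums[yi][ti];
    }
    return 0;
}

/* Read "year type count" rows until EOF or the first malformed row.
   Returns the number of rows read, or -1 if the table overflowed. */
int pivot_read(FILE *in, struct pivot *p)
{
    assert(in != NULL && p != NULL);
    int year, type, rows = 0;
    long cnt;
    while (fscanf(in, "%d %d %ld", &year, &type, &cnt) == 3) {
        if (pivot_add(p, year, type, cnt) != 0)
            return -1;
        ++rows;
    }
    return rows;
}

/* Print one header row, then one row per year, in first-seen order. */
void pivot_print(FILE *out, const struct pivot *p)
{
    assert(out != NULL && p != NULL);
    fprintf(out, "| Year");
    for (int ti = 0; ti < p->ntypes; ++ti)
        fprintf(out, " | %8d", p->types[ti]);
    fprintf(out, " |\n");
    for (int yi = 0; yi < p->nyears; ++yi) {
        fprintf(out, "| %d", p->years[yi]);
        for (int ti = 0; ti < p->ntypes; ++ti)
            fprintf(out, " | %8ld", p->sums[yi][ti]);
        fprintf(out, " |\n");
    }
}
```

To use it, start from a zeroed table (`struct pivot p = {0};`), call `pivot_read(fp, &p)` on the opened file, then `pivot_print(stdout, &p)`. The linear search is fine for a handful of years and types; for many distinct keys a sorted array with binary search or a hash table would scale better.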