c++ delimiter getline

How to extract data between two '|' delimiters from a text file using the getline function in C++?


There's data arranged between delimiters in a file, which looks like this: | No. | name. | product. | price. | I want to read one field at a time. Each section between the delimiters contains one or a few words. How do I read this?


Solution

  • The basic approach for a delimited file is to read the entire line into a string, create a stringstream from the line, and then loop over the stringstream calling getline with the delimiter ('|' here) to split the line into its fields.

    (This way the number of fields need not be known ahead of time. Without the stringstream, you must limit the number of getline calls to the number of fields present on each line, or getline will happily continue reading data from the next line.)
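
    To illustrate that last point, here is a minimal sketch of what happens when getline is called with '|' directly on the input stream, with no per-line step. The function name split_whole_stream is made up for this illustration and is not part of the program that follows.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read '|'-delimited tokens straight from the stream.
// Newlines are not treated specially, so they end up inside the tokens.
std::vector<std::string> split_whole_stream(std::istream& in) {
    std::vector<std::string> tokens;
    std::string val;
    while (std::getline(in, val, '|'))   // '|' is the only delimiter here
        tokens.push_back(val);
    return tokens;
}
```

    Fed the two-line input "|1|Joe|\n|2|Mike|\n", this returns the tokens "", "1", "Joe", "\n", "2", "Mike", "\n". The line boundary is swallowed into a token, so you can no longer tell where one record ends and the next begins.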

    The program below just reads each field of the input file into a string. For your needs you may want an additional conversion (e.g. stol for an integer field, stod for a floating-point field) to convert the individual fields to numeric types (that is left to you).

    #include <cstdio>       /* perror */
    #include <iostream>
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <limits>
    
    using namespace std;
    
    int main (int argc, char **argv) {
    
        string line;
    
        if (argc < 2) {
            cerr << "error: usage, program <filename>\n";
            return 1;
        }
    
        ifstream f (argv[1]);   /* open file */
        if (!f.is_open()) {     /* validate file open for reading */
            perror (("error while opening file " + string(argv[1])).c_str());
            return 1;
        }
    
        while (getline (f, line)) {         /* read each line */
            string val;                     /* string to hold value */
            stringstream s (line);          /* stringstream to parse fields */
            /* output original line */
            cout << "line: " << line << "\n\n";
            /* skip 1st delim */
            s.ignore(numeric_limits<streamsize>::max(), '|');
            while (getline (s, val, '|'))   /* for each field */
                cout << "  value: " << val << '\n'; /* output each field value */
            cout << '\n';
        }
    }
    

    Example Input File

    $ cat dat/fields.txt
    | 1   | Joe   | marbles  |   1.25 |
    | 2   | Mike  | jacks    |  13.49 |
    | 3   | Jill  | pail     |   4.50 |
    

    Example Use/Output

    $ ./bin/iostream_strstream_fields dat/fields.txt
    line: | 1   | Joe   | marbles  |   1.25 |
    
      value:  1
      value:  Joe
      value:  marbles
      value:    1.25
    
    line: | 2   | Mike  | jacks    |  13.49 |
    
      value:  2
      value:  Mike
      value:  jacks
      value:   13.49
    
    line: | 3   | Jill  | pail     |   4.50 |
    
      value:  3
      value:  Jill
      value:  pail
      value:    4.50
    
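
    Note that the values above still carry their padding spaces. If you want trimmed strings and numeric types, something along the following lines may help. This is only a sketch under a few assumptions: a C++11 compiler (for stol/stod), exactly four fields per line, and the names Record, trim, split_fields and parse_record, which are invented here for illustration.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for the four columns in the sample file.
struct Record {
    long id;
    std::string name;
    std::string product;
    double price;
};

// Strip leading and trailing whitespace from one field.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";
    std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Split "| a | b | c |" into trimmed fields. Text before the first '|' and
// after the last '|' (e.g. a stray '\r') is not treated as a field.
std::vector<std::string> split_fields(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream s(line);
    std::string val;
    std::getline(s, val, '|');          // discard everything up to the first '|'
    while (std::getline(s, val, '|')) {
        if (s.eof())                    // stopped at end of line, not at a '|'
            break;
        fields.push_back(trim(val));
    }
    return fields;
}

// Convert one line into a Record; returns false if the line does not have
// exactly four fields or a numeric field contains anything but a number.
bool parse_record(const std::string& line, Record& out) {
    std::vector<std::string> f = split_fields(line);
    if (f.size() != 4)
        return false;
    try {
        std::size_t pos = 0;
        long id = std::stol(f[0], &pos);
        if (pos != f[0].size())
            return false;
        double price = std::stod(f[3], &pos);
        if (pos != f[3].size())
            return false;
        out.id = id;
        out.name = f[1];
        out.product = f[2];
        out.price = price;
    } catch (const std::exception&) {   // stol/stod throw on bad input
        return false;
    }
    return true;
}
```

    Inside the read loop you would call parse_record(line, rec) for each line instead of printing the raw fields. Because the split is on '|' only, a multi-word field such as "Mary Ann" comes through intact.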

    Look things over and let me know if you have further questions.


    Rewrite in C for a 32-Year-Old Operating System

    Since you are attempting to compile with a still-unidentified C++ compiler on a 32-year-old operating system (none of which was specified in your original question), you are probably much better served writing your parse routine in C. Why? C++ had not been standardized at that time (the first ISO C++ standard was not published until 1998), and the language was closer to a superset of C than to what C++ is today. In the Turbo C and Borland C++ compilers of that day, all included files still used the C header file format (with '.h' at the end of the header name, e.g. #include <string.h>). This explains your "failure to find header" issue.

    C, on the other hand, was much closer to what is available today. Though it has gone through several standard revisions, the basic functions available then and now are the same. The following has a much better chance of compiling on whatever you have DOS 3.2 running on than the C++ code.

    #include <stdio.h>
    #include <string.h>
    
    #define FLDSZ 32   /* maximum field size for name and product (including '\0') */
    #define MAXC 256   /* maximum length for line (including '\0') */
    
    int main (int argc, char **argv) {
    
        char buf[MAXC] = "";    /* buffer to hold each line */
        FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
    
        if (!fp) {  /* validate file open for reading */
            fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
            return 1;
        }
    
        while (fgets (buf, MAXC, fp)) {
            int id;                     /* id from file */
            char name[FLDSZ] = "",      /* name */
                product[FLDSZ] = "";    /* product */
            float price;                /* price */
            size_t len = strlen (buf);  /* length of string in buffer */
            if (len == MAXC - 1 && buf[len - 1] != '\n') {  /* check it fit */
                fputs ("error: line too long.\n", stderr);
                return 1;
            }
            if (sscanf (buf, "|%d | %31s | %31s | %f",      /* parse fields */
                        &id, name, product, &price) != 4) {
                fputs ("error: failed to parse line.\n", stderr);
                continue;
            }
            /* output parsed fields */
            printf ("%3d %-10s %-10s %.2f\n", id, name, product, price);
        }
        if (fp != stdin) fclose (fp);   /* close file if not stdin */
    
        return 0;
    }
    

    (the program expects the data filename as the 1st argument, or it will read from stdin if no argument is given)

    Example Use/Output

    $ ./bin/fieldparse ~/dev/src-cpp/tmp/dat/fields.txt
      1 Joe        marbles    1.25
      2 Mike       jacks      13.49
      3 Jill       pail       4.50
    
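
    One caveat with the %31s conversions above: %s stops at the first whitespace character, so a field holding more than one word (e.g. "Mary Ann") will make the parse fail. A possible workaround, sketched below, is to read those fields with a scanset, %31[^|], which takes everything up to the next '|', and then strip the trailing padding. The helper names rtrim and parse_line are invented for this sketch. It sticks to the subset shared by C and C++, so it should build as either, though I have not tried it on a compiler of that era.

```cpp
#include <stdio.h>
#include <string.h>

#define FLDSZ 32   /* field size for name and product (including '\0') */

/* Remove trailing spaces and tabs from s in place. */
void rtrim (char *s)
{
    size_t n = strlen (s);
    while (n > 0 && (s[n - 1] == ' ' || s[n - 1] == '\t'))
        s[--n] = '\0';
}

/* Parse "| id | name | product | price |" from buf.
 * name and product must each point to at least FLDSZ bytes.
 * Returns 1 on success, 0 if the line does not match.
 */
int parse_line (const char *buf, int *id, char *name, char *product,
                float *price)
{
    /* %31[^|] reads up to 31 characters that are not '|', spaces included */
    if (sscanf (buf, " | %d | %31[^|] | %31[^|] | %f",
                id, name, product, price) != 4)
        return 0;
    rtrim (name);       /* the scanset keeps the padding before '|' */
    rtrim (product);
    return 1;
}
```

    With this in place, the sscanf call in the loop above would be replaced by a call such as parse_line (buf, &id, name, product, &price). A field longer than 31 characters, or one that is empty, causes the match to fail rather than overflow the buffer.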

    All I can tell you is give it a try and let me know what happens.