Tags: sorting, csv, c++11, vector, stl-algorithm

How to sort a CSV file by multiple columns in C++?


I am attempting to sort a CSV file by specifying which columns to sort by, and in what order:

for example: ./csort 3, 1, 5 < DATA > SORTED_DATA

or ./csort 3, 4, 6, 2, 1, 5 < DATA ...

example line of DATA: 177,27,2,42,285,220

I used a vector split(string str) function to store the columns, given as arguments, that the sort should use. This creates a vector:

vector<string> columns {"3", "1", "5"}; // for example

I am not entirely sure how to use this columns vector to proceed with the sorting, though I am aware that I could use sort:

sort(v.begin(), v.end(), myfunction);

Solution

  • As I understand your question, you have already parsed your data into 4 vectors, 1 vector per column, and you want to be able to sort your data by specifying the precedence of the columns to sort by, i.e. sort by col1, then col3, then col4...

    What you want to do isn't too difficult, but you'll have to backtrack a bit. There are multiple ways to approach the problem, but here's a rough outline. Based on the level of expertise you exhibit in your question, you might have to look up a few terms in the following outline, but if you do, you'll have a good, flexible solution to your problem.

    1. You want to sort rows, so store your data by row; 4 vectors for 4 columns won't help you here. If all 4 elements in a row are going to be the same type, you could use a std::vector or std::array for the row: std::array is solid if the number of columns is known at compile time, std::vector if it is only known at run time. If the types are inhomogeneous, you could use a tuple or a struct. Whatever type you use, let's call it RowT.

    2. Parse each line of input into a RowT, and store the rows in a vector of RowT.

    3. Define a function object that provides operator() taking a left-hand and a right-hand RowT. It must implement the "less than" operation following the precedence you want (std::sort requires this to be a strict weak ordering). Let's call that class CustomSorter.
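Step 3 might look something like the sketch below. It is only one possible shape for CustomSorter, under assumptions that are not in the question: every cell is an integer (as in the example line of DATA), RowT is a std::vector<int>, and columns are numbered from 1 as on the csort command line. The member name columns_ is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Assumption: every cell is an integer, as in the example line of DATA.
using RowT = std::vector<int>;

// Sketch of a comparator that orders rows by the given 1-based columns.
// The first column listed has the highest precedence.
class CustomSorter {
public:
    explicit CustomSorter(std::vector<std::size_t> columns)
        : columns_(std::move(columns)) {}

    // Returns true when lhs should be placed before rhs.
    bool operator()(const RowT& lhs, const RowT& rhs) const {
        for (std::size_t col : columns_) {
            // at() throws std::out_of_range if the column does not exist.
            const int a = lhs.at(col - 1);
            const int b = rhs.at(col - 1);
            if (a != b) {
                return a < b;  // first differing key decides the order
            }
        }
        return false;  // equal on every key: neither row is "less"
    }

private:
    std::vector<std::size_t> columns_;  // hypothetical member name
};
```

Returning false when all keys compare equal is what keeps the comparison a strict weak ordering. Rows that tie on every listed column may come out in any relative order with std::sort; std::stable_sort would preserve their input order instead.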

    Once you have that in place, your final sort will be:

    CustomSorter cs(/*precedence arguments*/);
    std::sort(rows.begin(), rows.end(), cs);
    

    Everything is really straightforward; a basic example can be seen here in the customsort example. In my experience, the only part you will have to work at is the sort algorithm itself.
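For steps 1 and 2, and for reading the column list from the command line, a minimal sketch could look like this. It assumes integer-only cells with no quoted fields, and that the arguments arrive as in the question ("3," "1," "5", possibly with trailing commas). The helper names parse_row, read_rows, parse_columns and write_rows are hypothetical, not part of any library.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Assumption: every cell is an integer and there are no quoted fields.
using RowT = std::vector<int>;

// Splits one CSV line such as "177,27,2,42,285,220" into integers.
RowT parse_row(const std::string& line) {
    RowT row;
    std::istringstream cells(line);
    std::string cell;
    while (std::getline(cells, cell, ',')) {
        row.push_back(std::stoi(cell));  // throws on empty or non-numeric cells
    }
    return row;
}

// Reads every non-empty line from the stream (e.g. std::cin) as one row.
std::vector<RowT> read_rows(std::istream& in) {
    std::vector<RowT> rows;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            rows.push_back(parse_row(line));
        }
    }
    return rows;
}

// Converts arguments such as "3,", "1,", "5" into 1-based column numbers.
// argv[0] is the program name and is skipped.
std::vector<std::size_t> parse_columns(int argc, const char* const argv[]) {
    std::vector<std::size_t> columns;
    for (int i = 1; i < argc; ++i) {
        // std::stoul stops at the first non-digit, so a trailing comma is ignored.
        columns.push_back(std::stoul(argv[i]));
    }
    return columns;
}

// Writes the rows back out in CSV form, one row per line.
void write_rows(std::ostream& out, const std::vector<RowT>& rows) {
    for (const RowT& row : rows) {
        for (std::size_t i = 0; i < row.size(); ++i) {
            if (i != 0) {
                out << ',';
            }
            out << row[i];
        }
        out << '\n';
    }
}
```

With helpers along these lines, main would call parse_columns(argc, argv), then read_rows(std::cin), run the std::sort call shown above with a CustomSorter built from the columns, and finish with write_rows(std::cout, rows).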