I want to read a csv data to vector of struct in cpp, This is what I wrote, I want to store the iris dataset in pointer of struct vector csv std::vector<Csv> *csv = new std::vector<Csv>;
#include <vector>
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
struct Csv{
float a;
float b;
float c;
float d;
std::string e;
};
int main(){
std::string colname;
// Iris csv dataset downloaded from
// https://gist.github.com/curran/a08a1080b88344b0c8a7
std::ifstream *myFile = new std::ifstream("iris.csv");
std::vector<Csv> *csv = new std::vector<Csv>;
std::string line;
// Read the column names
if(myFile->good())
{
// Extract the first line in the file
std::getline(*myFile, line);
// Create a stringstream from line
std::stringstream ss(line);
// Extract each column name
while(std::getline(ss, colname, ',')){
std::cout<<colname<<std::endl;
}
}
// Read data, line by line
while(std::getline(*myFile, line))
{
// Create a stringstream of the current line
std::stringstream ss(line);
}
return 0;
}
I dont know how to implement this part of the code which outputs line with both float and string.
// Read data, line by line
while(std::getline(*myFile, line))
{
// Create a stringstream of the current line
std::stringstream ss(line);
}
Evolution
We start with you program and complete it with your current programm style. Then we analyze your code and refactor it to a more C++ style solution. In the end we show a modern C++ solution using more OO methods.
First your completed code:
#include <vector>
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
struct Csv {
float a;
float b;
float c;
float d;
std::string e;
};
int main() {
std::string colname;
// Iris csv dataset downloaded from
// https://gist.github.com/curran/a08a1080b88344b0c8a7
std::ifstream* myFile = new std::ifstream("r:\\iris.csv");
std::vector<Csv>* csv = new std::vector<Csv>;
std::string line;
// Read the column names
if (myFile->good())
{
// Extract the first line in the file
std::getline(*myFile, line);
// Create a stringstream from line
std::stringstream ss(line);
// Extract each column name
while (std::getline(ss, colname, ',')) {
std::cout << colname << std::endl;
}
}
// Read data, line by line
while (std::getline(*myFile, line))
{
// Create a stringstream of the current line
std::stringstream ss(line);
// Extract each column
std::string column;
std::vector<std::string> columns{};
while (std::getline(ss, column, ',')) {
columns.push_back(column);
}
// Convert
Csv csvTemp{};
csvTemp.a = std::stod(columns[0]);
csvTemp.b = std::stod(columns[1]);
csvTemp.c = std::stod(columns[2]);
csvTemp.d = std::stod(columns[3]);
csvTemp.e = columns[4];
// STore new row data
csv->push_back(csvTemp);
}
// Show everything
for (const Csv& row : *csv)
std::cout << row.a << '\t' << row.b << '\t' << row.c << '\t' << row.d << '\t' << row.e << '\n';
return 0;
}
The question that you have regarding the reading of the columns from your Csv file, can be answered like that:
You need a temporary vector. Then you use the std::getline
function, to split the data in the std::istringstream
and to copy the resulting substrings into the vector. After that, we use string conversion functions and assign the rsults in a temporary Csv struct variable. After all conversions have been done, we move the temporary into the resulting csv vector that holds all row data.
Analysis of the program.
First, and most important, in C++ we do not use raw pointers for owned memory. We should ven not use new
in most case. If at all, std::unique_ptr
and std::make_unique
should be used.
But we do not need dynamic memory allocation on the heap at all. You can simply define the std::vector
on the functions stack. Same like in your line std::string colname;
you can also define the std::vector
and the std::ifstream
as a normal local variable. Like for example std::vector<Csv> csv{};
. Only, if you pass this variable to another function, then use pointers, but smart pointers.
Next, if you open a file, like in std::ifstream myFile("r:\\iris.csv");
you do not need to test the file streams condition with if (myFile->good())
. The std::fstream
s bool operator is overwritten, to give you exactly this information. Please see here.
Now, next and most important.
The structure of your source file is well known. There is a header with 5 elements and then 4 doubles and at then end a string without spaces. This makes life very easy.
If we would need to validate the input or if there would be spaces within an string, then we would need to implement other methods. But with this structure, we can use the build in iostream facilities. The snippet
// Read all data
Csv tmp{};
char comma;
while (myFile >> tmp.a >> comma >> tmp.b >> comma >> tmp.c >> comma >> tmp.d >> comma >> tmp.e)
csv.push_back(std::move(tmp));
will do the trick. Very simple.
So, the refactored solution could look like this:
#include <vector>
#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
struct Csv {
float a;
float b;
float c;
float d;
std::string e;
};
int main() {
std::vector<Csv> csv{};
std::ifstream myFile("r:\\iris.csv");
if (myFile) {
if (std::string header{}; std::getline(myFile, header)) std::cout << header << '\n';
// Read all data
Csv tmp{};
char comma;
while (myFile >> tmp.a >> comma >> tmp.b >> comma >> tmp.c >> comma >> tmp.d >> comma >> tmp.e)
csv.push_back(std::move(tmp));
// Show everything
for (const Csv& row : csv)
std::cout << row.a << '\t' << row.b << '\t' << row.c << '\t' << row.d << '\t' << row.e << '\n';
}
return 0;
}
This is already much more compact. But there is more . . .
In the next step, we want to add a more Object Oriented approch.
The key is that data and methods, operating on this data, should be encapsulated in an Object / class / struct. Only the Csv struct should know, how to read and write its data.
Hence, we overwrite the extractor and inserter operator for the Csv struct. We use the same approach than before. We just encapsulate the reading and writing in the struct Csv.
After that, the main function will be even more compact and the usage is more logical.
Now we have:
#include <vector>
#include <iostream>
#include <fstream>
#include <string>
struct Csv {
float a;
float b;
float c;
float d;
std::string e;
friend std::istream& operator >> (std::istream& is, Csv& c) {
char comma;
return is >> c.a >> comma >> c.b >> comma >> c.c >> comma >> c.d >> comma >> c.e;
}
friend std::ostream& operator << (std::ostream& os, const Csv& c) {
return os << c.a << '\t' << c.b << '\t' << c.c << '\t' << c.d << '\t' << c.e << '\n';
}
};
int main() {
std::vector<Csv> csv{};
if (std::ifstream myFileStream("r:\\iris.csv"); myFileStream) {
if (std::string header{}; std::getline(myFileStream, header)) std::cout << header << '\n';
// Read all data
Csv tmp{};
while (myFileStream >> tmp)
csv.push_back(std::move(tmp));
// Show everything
for (const Csv& row : csv)
std::cout << row;
}
return 0;
}
OK. Alread rather good. Bit there is even more possible.
We can see that the source data has a header and then Csv data.
Also this can be modelled into a struct. We call it Iris. And we also add an extractor and inserter overwrite to encapsulate all IO-operations.
Additionally we use now modern algorithms, regex, and IO-iterators. I am not sure, if this is too complex now. If you are interested, then I can give you further information. But for now, I will just show you the code.
#include <vector>
#include <iostream>
#include <fstream>
#include <string>
#include <algorithm>
#include <regex>
#include <iterator>
const std::regex re{ "," };
struct Csv {
float a;
float b;
float c;
float d;
std::string e;
// Overwrite extratcor for simple reading of data
friend std::istream& operator >> (std::istream& is, Csv& c) {
char comma;
return is >> c.a >> comma >> c.b >> comma >> c.c >> comma >> c.d >> comma >> c.e;
}
// Ultra simple inserter
friend std::ostream& operator << (std::ostream& os, const Csv& c) {
return os << c.a << "\t\t" << c.b << "\t\t" << c.c << "\t\t" << c.d << "\t\t" << c.e << '\n';
}
};
struct Iris {
// Iris data consits of header and then Csv Data
std::vector<std::string> header{};
std::vector<Csv> csv{};
// Overwrite extractor for generic reading from streams
friend std::istream& operator >> (std::istream& is, Iris& i) {
// First read header values;
if (std::string line{}; std::getline(is, line))
std::copy(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {}, std::back_inserter(i.header));
// Read all csv data
std::copy(std::istream_iterator<Csv>(is), {}, std::back_inserter(i.csv));
return is;
}
// Simple output. Copy data to stream os
friend std::ostream& operator << (std::ostream& os, const Iris& i) {
std::copy(i.header.begin(), i.header.end(), std::ostream_iterator<std::string>(os, "\t")); std::cout << '\n';
std::copy(i.csv.begin(), i.csv.end(), std::ostream_iterator<Csv>(os));
return os;
}
};
// Driver Code
int main() {
if (std::ifstream myFileStream("r:\\iris.csv"); myFileStream) {
Iris iris{};
// Read all data
myFileStream >> iris;
// SHow result
std::cout << iris;
}
return 0;
}
Look at the main function and how easy it is.
If you have questions, then please ask.
Language: C++17
Compiled and tested with MS Visual Studio 2019, community edition